Thanks again for your quick response. Your comments are really helpful. Sorry to keep asking more questions but I would really like to use your product if I can figure out how to do the same things we are doing with Word COM Automation.
For Item 1), thanks - this looks straightforward enough, so I will give that a try.
For Item 4), thanks - I realized I was calling the replace evaluator with the false parameter for the direction so it wasn’t searching forward, duh.
For Item 2), my text file already contains the “\n” characters, but they do not seem to be replaced with actual paragraph marks. The code I tried first, taken from the “LoadTxt” example on your web site, is as follows:
using (StreamReader reader = new StreamReader(dataDir + "sdnlist.txt", Encoding.UTF8))
{
    // Read plain text "lines" and convert them into paragraphs in the document.
    while (true)
    {
        string line = reader.ReadLine();
        if (line != null)
            builder.Writeln(line);
        else
            break;
    }
}
I also tried the following:
using (StreamReader reader = new StreamReader(dataDir + "sdnlist.txt", Encoding.UTF8))
{
    // Read the whole file as plain text and write it in a single call.
    string text = reader.ReadToEnd();
    builder.Write(text);
}
However, neither of these seems to convert the “\n” to a paragraph break. In fact, if you look at the text through Document.Range after it has been loaded, the “\n” characters have actually been converted to “\\n”, i.e. an additional backslash (\) has been prepended. The only way I seem to be able to get paragraph breaks in the document is to use the following:
sDoc.Range.Replace(new Regex(@"\\n"), new ReplaceEvaluator(ParagraphEvaluator), false);
}
private static ReplaceAction ParagraphEvaluator(object sender, ReplaceEvaluatorArgs e)
{
    // Replace each matched literal "\n" with a real paragraph break.
    e.Replacement = ControlChar.ParagraphBreak;
    return ReplaceAction.Replace;
}
But the document is fairly large, and it takes over 5 minutes to go through and put the paragraph breaks in. It would seem that, since the “\n” is already in the text file when it is loaded, I shouldn’t need this step.
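In case it helps show what I mean, here is a minimal sketch of a workaround I am considering: split the raw text on the literal two-character sequence (a backslash followed by n) in plain .NET before it reaches the document, and then write each piece as its own paragraph. This assumes the file really contains backslash + n rather than true line breaks, which is what the “\\n” I see when inspecting the text suggests. The class and method names are hypothetical.

```csharp
using System;

public static class LiteralNewlineSketch
{
    // Hypothetical helper: splits raw text on the literal two-character
    // sequence backslash + 'n' (not on a real newline character).
    public static string[] SplitOnLiteralNewline(string raw)
    {
        // "\\n" in C# source is the two characters '\' and 'n'.
        return raw.Split(new[] { "\\n" }, StringSplitOptions.None);
    }
}
```

Each piece could then be passed to builder.Writeln(piece), as in the LoadTxt example, which might avoid the slow Range.Replace pass altogether. I have not measured that yet.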
For Item 3), I did try the regular expression that you mentioned, but the problem (which may be related to Item 2 above) is that it matches the text from the very first occurrence of the “|” to the very last occurrence of the “|” instead of finding each “pair”. For example, given the following text:
|This should be bolded|This should not be bolded|This should be bolded again|
It is returning the entire string as a match. What I want is actually 2 separate matches, i.e.
|This should be bolded|
|This should be bolded again|
but not the text “This should not be bolded”.
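To make the expected result concrete, here is a small sketch in plain .NET (no document involved) of the matching behavior I am after. It uses a pattern that excludes the “|” character between the two markers, which is my own guess at a fix and not the expression you suggested. The class and method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MarkerMatchSketch
{
    // Hypothetical helper: returns each "|...|" pair as a separate match.
    // [^|]* cannot cross a pipe, so a match stops at the next marker
    // instead of running on to the last one in the text.
    public static List<string> FindMarked(string text)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(text, @"\|[^|]*\|"))
        {
            results.Add(m.Value);
        }
        return results;
    }
}
```

Because the closing “|” of one match is consumed, the middle segment is skipped and the search resumes at the third “|”, so the sample line above should yield exactly the two matches I listed. Whether the same pattern behaves this way inside Range.Replace is something I have not verified.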
One other interesting thing that may be related to the paragraph problem: when I step into the ReplaceEvaluator method and look at the value of e.MatchNode, it contains the entire document. Is that because the document is just one huge paragraph? I’m thinking that if I can get the document actually broken into paragraphs before I start matching text, it might help things.
I’m also a little unclear on the concept of the “Run”. Since my text all has basically the same formatting, it seems that I only have 1 run, which is the entire document. It seems like I need to be able to extract the text between my markers, make that a “run”, and then format that run. Is that correct?
Thanks again for all of your help. Your product looks very nice and I hope we can get past these last few issues.