Free Support Forum - aspose.com

Regex replace

Your documentation states that an exception will be thrown if the captured text contains one or more special characters, which include paragraph breaks.

Can you please tell me how I can find and replace text such as:


We want the signature to go here.¶

We can write a simple regex that will capture all text from the opening tag to the closing tag. However, Aspose will throw an exception because of the paragraph marks that are found within the captured text.

I could write a special case that ensures this particular tag does not include a paragraph mark. Nonetheless, I cannot guarantee that other documents will never contain paragraph marks within the text.

Can you please tell me how I am supposed to handle search/replace where the search result may contain a paragraph break?

Thank you!



Hi

Thanks for your inquiry. I think you can find the paragraphs that contain the start tag and the end tag, then remove all content between these paragraphs and insert the new content.

If you like, I can try to create sample code for you. If so, please attach your sample template here and I will be glad to help.

Best regards.

Hello Alexey,

Attached is a *very* simplistic example of what I'm trying to do, based on one of the examples in your toolkit.

In a nutshell, what I want to do is find "tags" that are typed into an RTF document and replace them with data. Obviously this can be done with OpenXML using Microsoft's tools. However, our desire is to provide a well-known tool (Microsoft Word) that allows users to create the template and save as RTF. Next, we will (hopefully using Aspose) open the template and replace the "tags" with real data.

The issue that I am running into with Aspose is related to the limitation where I cannot replace search results that contain "special characters." Referring to the first post I made in this thread you'll note that the following RegEx...

Regex regex = new Regex(@"<\s*(?[^<>/]*?)\s*(?::(?[^/]*))*\s*(?/)?\s*>(?:(?.*)(?<\s*/\s*\k\s*>))?", RegexOptions.Singleline | RegexOptions.Multiline | RegexOptions.IgnoreCase)

... will match the following text in the document...


We want the signature to go here.¶

Obviously that run of text includes 2 paragraph marks (not including the final mark), which are considered "special characters", and thus my replace throws an exception in Aspose. So I am now trying to figure out a nice workaround that will let me quickly identify "tags" in the RTF document and thus know where my replacement text goes.
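To make the multi-paragraph match concrete, here is a minimal sketch on a plain string, with no Aspose involved. It uses a simplified pattern, not the one quoted above. The group names `name` and `body`, the tag name `DEFAULT`, and the use of '\r' to stand in for a paragraph mark are all assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagMatchDemo
{
    // Simplified, hypothetical tag pattern: <NAME> ... </NAME>.
    // RegexOptions.Singleline lets '.' match line-break characters as well,
    // which is what allows a single match to run across paragraph marks.
    private static readonly Regex TagPattern = new Regex(
        @"<\s*(?<name>[A-Za-z]+)\s*>(?<body>.*?)<\s*/\s*\k<name>\s*>",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Returns true when the first tag match contains at least one
    // paragraph mark (modelled here as '\r', an assumption).
    public static bool FirstMatchSpansParagraphs(string text)
    {
        Match m = TagPattern.Match(text);
        return m.Success && m.Value.IndexOf('\r') >= 0;
    }

    // Returns the name of the first tag found, or null if there is none.
    public static string FirstTagName(string text)
    {
        Match m = TagPattern.Match(text);
        return m.Success ? m.Groups["name"].Value : null;
    }
}
```

A check like `FirstMatchSpansParagraphs` could be used to decide, per match, whether a plain replace is safe or whether the match needs to be handled paragraph by paragraph.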

I have experimented with creating a class derived from the Run node that I can just insert into the document structure in place of the "tags." However, I'm finding that simply using e.MatchNode.Parent.InsertBefore(new TagNode(), e.MatchNode) causes Aspose to throw a null reference exception. So, I'm still kind of stuck.

I've attached my project (not including my experiments with a class derived from the Run node) so you can see what I'm doing. You'll see at this point it is extremely simple.

I only have about 24 hours left to make a decision as to whether Aspose will work for us or not, so any advice you have is greatly appreciated. Please let me know if you need any further information and I will be happy to provide whatever I can.

Thanks for all of the help you have been to me.

- Drew

OK. I've done a little more experimenting, and I think the second set of exceptions I was getting is related to trying to insert nodes into the DOM while the regex delegate function is still processing. I did a quick test where I marked the start and end nodes of a text run found during the RegEx, but waited until the RegEx was complete to insert the new nodes, and it worked as expected.

I'll explore this route a little more and see if I can find a solution using this technique.
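The "mark first, modify afterwards" idea can be sketched independently of Aspose. In the illustration below, a `List<string>` stands in for the runs of a document, and the names `FindTagRuns` and `InsertMarkers` are hypothetical. It only shows the two-phase pattern, not the real DOM calls.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class DeferredEdits
{
    // Phase 1: scan the runs and record the indexes of tag-like runs.
    // Nothing is inserted or removed while the scan is in progress.
    public static List<int> FindTagRuns(IReadOnlyList<string> runs)
    {
        var hits = new List<int>();
        var tag = new Regex(@"^<\s*/?\s*[A-Za-z]+\s*/?\s*>$");
        for (int i = 0; i < runs.Count; i++)
        {
            if (tag.IsMatch(runs[i])) hits.Add(i);
        }
        return hits;
    }

    // Phase 2: apply the recorded edits once the scan has finished.
    // Working from the last hit to the first keeps earlier indexes valid.
    public static void InsertMarkers(List<string> runs, List<int> hits, string marker)
    {
        for (int k = hits.Count - 1; k >= 0; k--)
        {
            runs.Insert(hits[k], marker);
        }
    }
}
```

Separating the scan from the edits means the structure being walked is never changed underneath the code that is walking it, which seems to be what caused the exceptions described above.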

Hi

Thank you for additional information. I created sample code for you. Please see the attached class. Here is how you can use it:

Document doc = new Document(@"Test001\OriginalTemplate.rtf");
TagsHelper helper = new TagsHelper(doc);

// Replace a few self-closing tags.
helper.ReplaceTag("ADTFULLNAME", "Alexey Noskov");
helper.ReplaceTag("ADTDATEOFBIRTH", "12/12/12");
helper.ReplaceTag("ADTAGE", "23");
helper.ReplaceTag("ADTSEX", "male");

// Replace a few composite tags.
helper.ReplaceTag("ALLERGIES", "Chocolate\nApples\netc");
helper.ReplaceTag("PHYSICALEXAMINATION", "Test text");
helper.ReplaceTag("DEFAULT", "Here is some default test\nyou can insert multiline text here");

// Save the output document.
doc.Save(@"Test001\out.doc");
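The attached class is not reproduced in this thread. As a rough illustration of the approach only (find the paragraph with the start tag and the paragraph with the end tag, remove everything between them, insert the new content), here is a sketch that works on a plain list of strings rather than on a document. `ParagraphTagSketch` and its `ReplaceTag` method are hypothetical and are not the `TagsHelper` API. The sketch assumes well-formed, non-nested tags, with each list element standing in for one paragraph.

```csharp
using System;
using System.Collections.Generic;

public static class ParagraphTagSketch
{
    // Replaces everything from the paragraph containing <tag> through the
    // paragraph containing </tag> with the lines of the replacement text.
    // Returns false when no matching start/end pair is found.
    public static bool ReplaceTag(List<string> paragraphs, string tag, string replacement)
    {
        string open = "<" + tag + ">";
        string close = "</" + tag + ">";

        int start = paragraphs.FindIndex(p => p.Contains(open));
        if (start < 0) return false;
        int end = paragraphs.FindIndex(start, p => p.Contains(close));
        if (end < 0) return false;

        // Keep any text before the opening tag and after the closing tag.
        string startPara = paragraphs[start];
        string endPara = paragraphs[end];
        string prefix = startPara.Substring(0, startPara.IndexOf(open, StringComparison.Ordinal));
        string suffix = endPara.Substring(endPara.IndexOf(close, StringComparison.Ordinal) + close.Length);

        // Each '\n' in the replacement starts a new paragraph.
        string[] lines = replacement.Split('\n');
        lines[0] = prefix + lines[0];
        lines[lines.Length - 1] += suffix;

        paragraphs.RemoveRange(start, end - start + 1);
        paragraphs.InsertRange(start, lines);
        return true;
    }
}
```

Because the work is done per paragraph, no single search result ever has to contain a paragraph break, which sidesteps the special-character limitation discussed earlier in the thread.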

I hope this helps.

Best regards.