I am trying to use a regex to find specific text and replace each occurrence with an image, but I am not getting the expected results. I have attached the test document (Test.docx) and the resulting document after replacement (TestResult.docx). The regex appears to match everything from the first <#inf to the final #> as a single match, rather than finding 5 separate instances of the text between <#inf and #> (the highlighted lines in the test document) and replacing them with 5 images. I have also attached TestExpectation.docx, which shows the result I am expecting.
Here’s the C# code I am using:
// Namespaces used by the code below.
using System.IO;
using System.Text.RegularExpressions;
using Aspose.Words;
using Aspose.Words.Replacing;

Document doc = new Document(@"C:\Temp\Test.docx");
FindReplaceOptions options = new FindReplaceOptions();
options.MatchCase = false;
options.ReplacingCallback = new FindAndInsertImage(options);
var regex = new Regex(@"<#inf\s[a-zA-Z]+.*#>", RegexOptions.IgnoreCase);
doc.Range.Replace(regex, String.Empty, options);
doc.Save(@"C:\Temp\TestResult.docx");
private class FindAndInsertImage : IReplacingCallback
{
    internal FindAndInsertImage(FindReplaceOptions options)
    {
        mOptions = options;
    }

    // This simplistic method will only work well when the match starts at the beginning of a run.
    ReplaceAction IReplacingCallback.Replacing(ReplacingArgs args)
    {
        DocumentBuilder builder = new DocumentBuilder((Document)args.MatchNode.Document);
        builder.MoveTo(args.MatchNode);
        var shape = builder.InsertImage(File.ReadAllBytes(@"c:\temp\barcode1.png"));
        return ReplaceAction.Replace;
    }

    private readonly FindReplaceOptions mOptions;
}
Testing the same regex on https://regex101.com/ shows 5 separate matches, as I expected.
I have also tried the following regexes:
- <#inf\sBarcode\s[a-zA-Z]+.*#>
- <#inf .*#>
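The difference between regex101 and my code can be reproduced without Aspose.Words, which may help narrow it down. The sketch below is only an illustration under assumptions: the sample tag texts are made up, and it assumes the paragraphs in the searched text are separated by '\r' (which I believe is what ControlChar.ParagraphBreak is in Aspose.Words) rather than by '\n' as on regex101. In .NET, "." matches every character except '\n', so a greedy ".*" can run across '\r' separators. The lazy pattern with ".*?" is a variant I have not yet tried in the real document.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the document text: five tags, one per paragraph.
// Assumption: paragraphs are separated by '\r' in the text being searched.
string text = string.Join("\r", new[]
{
    "<#inf Barcode one#>",
    "<#inf Barcode two#>",
    "<#inf Barcode three#>",
    "<#inf Barcode four#>",
    "<#inf Barcode five#>",
});

// Pattern from my code: the greedy ".*" extends to the last "#>" it can reach.
var greedy = new Regex(@"<#inf\s[a-zA-Z]+.*#>", RegexOptions.IgnoreCase);

// Lazy variant: ".*?" stops at the first "#>" after each "<#inf".
var lazy = new Regex(@"<#inf\s[a-zA-Z]+.*?#>", RegexOptions.IgnoreCase);

// '\r' separators: "." matches '\r', so the greedy pattern spans all five tags.
int greedyCount = greedy.Matches(text).Count;

// '\n' separators (as when pasting lines into regex101): "." stops at '\n'.
int greedyNewlineCount = greedy.Matches(text.Replace("\r", "\n")).Count;

// The lazy pattern finds each tag separately even with '\r' separators.
int lazyCount = lazy.Matches(text).Count;

Console.WriteLine($"greedy with \\r: {greedyCount}");        // 1
Console.WriteLine($"greedy with \\n: {greedyNewlineCount}"); // 5
Console.WriteLine($"lazy with \\r:   {lazyCount}");          // 5
```

If the '\r' assumption holds for the real document, this would explain why the same pattern gives 5 matches on regex101 but a single match in my code.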
TestResult.docx (61.5 KB)
Test.docx (12.6 KB)
TestExpectation.docx (67.3 KB)