Hi Mukul,
Thanks for your inquiry. I have tested your scenario with the shared document using Aspose.Pdf for .NET 10.4.0 and was able to reproduce the reported issue. For further investigation, I have logged it in our issue tracking system as PDFNEWNET-38761 and linked your request to it. We will keep you updated in this thread on the issue status.
Please feel free to contact us for any further assistance.
Best Regards
Hi Mukul,
Thanks for your patience. We have investigated the issue and found two problems in the regular expression you are using. First, '.' in a regular expression matches any single character except '\n' (https://msdn.microsoft.com/en-us/library/az24scfc(v=vs.110).aspx), but '\n' is present in text that spans multiple lines. The correct expression between the curly braces is therefore '(.|\n)*?'.
Second, a minor problem: the document uses double curly braces, but you used lazy matching (the '?' modifier) between single curly braces. The match then stops at the first closing brace, so text ending with double curly braces is not found. Please use the following code snippet to accomplish the task.
// Load the source document
Document doc = new Document(myDir + "Test7.pdf");
// Match text between double curly braces, including text spanning several lines
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber("{{(.|\n)*?}}");
// Pass true so the search phrase is treated as a regular expression
TextSearchOptions textSearchOptions = new TextSearchOptions(true);
textFragmentAbsorber.TextSearchOptions = textSearchOptions;
// Run the search over all pages
doc.Pages.Accept(textFragmentAbsorber);
TextFragmentCollection textFragmentCollection = textFragmentAbsorber.TextFragments;
// Loop through the fragments
foreach (TextFragment textFragment in textFragmentCollection)
{
    Console.WriteLine("Text : {0} ", textFragment.Text);
}
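If you would like to check the two points above without a PDF, the following is a minimal sketch using plain System.Text.RegularExpressions. It assumes the absorber's regular-expression search behaves like the standard .NET Regex class for these patterns, which we have not verified here. The class name RegexNewlineDemo, the helper FindAll and the sample text are hypothetical and only serve the illustration. The braces are escaped in verbatim strings to make the patterns unambiguous; the intent is the same as in the snippet above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexNewlineDemo
{
    // Returns every match of pattern in text, with newlines shown as "\n" for printing.
    public static List<string> FindAll(string text, string pattern)
    {
        var results = new List<string>();
        foreach (Match match in Regex.Matches(text, pattern))
        {
            results.Add(match.Value.Replace("\n", "\\n"));
        }
        return results;
    }

    public static void Main()
    {
        // Hypothetical sample text; the second placeholder spans two lines.
        string text = "{{first}} and {{second\nline}}";

        // '.' cannot cross '\n', so only the single-line placeholder is found.
        Print("dot only", FindAll(text, @"\{\{.*?\}\}"));

        // (.|\n) also accepts '\n', so both placeholders are found.
        Print("dot or newline", FindAll(text, @"\{\{(.|\n)*?\}\}"));

        // Single braces with lazy matching stop at the first '}',
        // so each match loses its final closing brace.
        Print("single braces", FindAll(text, @"\{(.|\n)*?\}"));
    }

    private static void Print(string label, List<string> matches)
    {
        Console.WriteLine("{0}: {1}", label, string.Join(" | ", matches));
    }
}
```

Comparing the three printed lines shows why the pattern needs both the '(.|\n)' alternative and the double braces.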
Best Regards,