Want to replace Text in PDF File

Hi,

We are using the following code to find curly braces in a PDF document.
This is required because we need to replace the text between the curly braces.

TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber("{(.*?)}"); // regex: text between curly braces
TextSearchOptions textSearchOptions = new TextSearchOptions(true);
textFragmentAbsorber.TextSearchOptions = textSearchOptions;
pdfDocument.Pages.Accept(textFragmentAbsorber);
TextFragmentCollection textFragmentCollection = textFragmentAbsorber.TextFragments;
//loop through the fragments
foreach (TextFragment textFragment in textFragmentCollection)
{
    Console.WriteLine("Text : {0} ", textFragment.Text);
}


This works as expected when the opening and closing curly braces ({}) are on the same line in a table cell.
If the text wraps within the cell, the code fails to find the closing curly brace.

Attached is the sample document for your reference.

The code works for the first row but not for the second row in the attached document.

Hi Mukul,

Thanks for your inquiry. I have tested your scenario with the shared document using Aspose.Pdf for .NET 10.4.0 and was able to reproduce the reported issue. For further investigation, I have logged it in our issue tracking system as PDFNEWNET-38761 and linked your request to it. We will keep you updated in this thread regarding the issue status.

Please feel free to contact us for any further assistance.

Best Regards

Hi,

Please let us know if there is any progress on this issue.

Thanks,

Hi Mukul,


Thanks for your inquiry. I am afraid your issue is not resolved yet. Because it was logged only recently, it is still pending investigation behind other issues that are already being investigated and resolved. We will let you know as soon as we make significant progress towards resolving it.

We are sorry for the inconvenience caused.

Best Regards,

Hi,


Please let us know if there is any update on this.

We need to implement a feature that depends on it.

Thanks,

Hi Mukul,


Thanks for your inquiry. I am afraid the issue is still pending investigation. The product team is currently busy resolving other priority issues, as we address issues on a first come, first served basis. We will keep you updated on the progress of the issue resolution.

Thanks for your patience and cooperation.

Best Regards,

Hi,


Please let us know if there is any update on this.

Thanks,

Mukul

Hi Mukul,


Thanks for your inquiry. I am afraid your reported issue is still not resolved, as the product team is busy resolving other issues that were reported earlier. However, we have raised the issue priority in our issue tracking system and asked our team to investigate the issue and share an ETA at their earliest. We will notify you as soon as we make significant progress towards resolving it.

Thanks for your patience and cooperation.

Best Regards,

Hi Mukul,

Thanks for your patience. We have investigated the issue and found two problems in the regular expression you are using. The first is that ‘.’ in a regular expression matches any single character except ‘\n’ (https://msdn.microsoft.com/en-us/library/az24scfc(v=vs.110).aspx), but ‘\n’ is present in text that spans multiple lines. Therefore the right expression between the curly braces is ‘(.|\n)*?’.
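
The first point can be checked outside the PDF with the plain .NET Regex class. The following is only a minimal sketch: the sample string and the DotNewlineDemo/FirstMatch names are made up for illustration, and the braces are escaped here to keep the pattern unambiguous. It does not use Aspose.Pdf at all.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DotNewlineDemo
{
    // Returns the first match of pattern in input, or "" when nothing matches.
    public static string FirstMatch(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        // Hypothetical cell text where the placeholder wraps onto a second line.
        string wrapped = "{{Customer\nName}}";

        // '.' does not match '\n', so the closing braces are never reached.
        Console.WriteLine(FirstMatch(wrapped, @"\{\{(.*?)\}\}") == "");          // True

        // (.|\n) also accepts the line break, so the whole placeholder matches.
        Console.WriteLine(FirstMatch(wrapped, @"\{\{(.|\n)*?\}\}") == wrapped);  // True
    }
}
```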

The second is a minor problem. Double curly braces are used in the document, but you used lazy matching (the ‘?’ quantifier) between single curly braces. The match therefore stops at the first closing brace, and text ending with double curly braces is not found in full. Please use the following code snippet; it should help you accomplish the task.

Document doc = new Document(myDir + "Test7.pdf");
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber("{{(.|\n)*?}}");
TextSearchOptions textSearchOptions = new TextSearchOptions(true);
textFragmentAbsorber.TextSearchOptions = textSearchOptions;
doc.Pages.Accept(textFragmentAbsorber);
TextFragmentCollection textFragmentCollection = textFragmentAbsorber.TextFragments;
//loop through the fragments
foreach (TextFragment textFragment in textFragmentCollection)
{
    Console.WriteLine("Text : {0} ", textFragment.Text);
}
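
To see why the single-brace lazy pattern misses the final brace, the two patterns can be compared on a plain string with the standard .NET Regex class. This is a rough sketch only: the sample sentence and the DoubleBraceDemo/FirstMatch names are hypothetical, the braces are escaped for clarity, and nothing here touches the PDF or Aspose.Pdf.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DoubleBraceDemo
{
    // Returns the first match of pattern in input, or "" when nothing matches.
    public static string FirstMatch(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        // Hypothetical text containing a double-brace placeholder.
        string text = "Dear {{name}}, welcome";

        // Lazy match between single braces stops at the first '}'.
        Console.WriteLine(FirstMatch(text, @"\{(.*?)\}"));         // {{name}

        // Requiring two closing braces captures the full placeholder.
        Console.WriteLine(FirstMatch(text, @"\{\{(.|\n)*?\}\}"));  // {{name}}
    }
}
```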

Best Regards,