PDF search - get the page number of found textFragment

Hi,
I am currently trying to highlight some words in a large PDF file. Since just setting textFragment.TextState.BackgroundColor = System.Drawing.Color.Yellow;
does not work in all cases, I also need to put an annotation in the PDF.
How can I retrieve the page where the textFragment was found, so I can put an annotation in that place?

Thank you

Hi Tomas,

Thanks for using our products.

Currently there is no method or property in the TextFragment class that returns the number of the page on which a particular text string was found. However, as a workaround, you may consider traversing each page of the PDF document; wherever the string is present, you can change the BackgroundColor and add the annotation at the same place.
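
The page-by-page workaround could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the class name, method name, file paths and search phrase are placeholders, and the Aspose.PDF calls used (Page.Accept, TextFragment.Rectangle, HighlightAnnotation, Annotations.Add) should be checked against the version you have installed. Because each page is searched with its own absorber, the loop index is the page number of every fragment found.

```csharp
using Aspose.Pdf;
using Aspose.Pdf.Annotations;
using Aspose.Pdf.Text;

public static class PerPageHighlighter
{
    // Hypothetical helper: searches every page separately and adds a
    // highlight annotation over each match on the page where it was found.
    public static void HighlightPhrase(string inputPath, string outputPath, string phrase)
    {
        var document = new Document(inputPath);

        // The Pages collection is 1-based.
        for (int pageNumber = 1; pageNumber <= document.Pages.Count; pageNumber++)
        {
            Page page = document.Pages[pageNumber];

            // A fresh absorber per page, so TextFragments only holds
            // matches from the current page.
            var absorber = new TextFragmentAbsorber(phrase);
            page.Accept(absorber);

            foreach (TextFragment fragment in absorber.TextFragments)
            {
                // fragment.Rectangle is the bounding box of the match on this page.
                var highlight = new HighlightAnnotation(page, fragment.Rectangle);
                page.Annotations.Add(highlight);
            }
        }

        document.Save(outputPath);
    }
}
```

Note that a phrase split across two pages would not be matched by this approach, since each page is searched in isolation.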

In case I have not properly understood your requirement or you have any further queries, please feel free to contact us. We are sorry for the inconvenience.

Hi,

Thank you for your answer.
Searching each page individually would not find phrases which begin on one page and continue on the next one. But it seems to me that this is a bug in Aspose.PDF, and the search doesn't find such phrases even when using

document.Pages.Accept(absorber);

so I can use the solution you proposed.

Hi Tomas,

Could you please share a sample PDF document and the code snippet that would help us replicate this issue? We are sorry for the inconvenience.

Hi,


Here is the PDF file and the code:

var regex = @"((?i)(Hello([\s]+)World))";
var document = new Document(pdf);
var absorber = new TextFragmentAbsorber(regex, new TextSearchOptions(isRegularExpressionUsed: true));

document.Pages.Accept(absorber);

var fragments = absorber.TextFragments;


The fragments collection contains 0 fragments after running the code.
The text is not found if “Hello” is on page 1 and “World” is on page 2. If they are on the same page, the regex finds it correctly.
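
To illustrate the difference between searching page by page and searching the text as a whole, here is a minimal sketch that uses plain strings in place of page text and only System.Text.RegularExpressions, with no Aspose.PDF calls. The class name, method names and sample strings are made up for the illustration; it shows the expected regex behaviour only and says nothing about how Aspose.PDF joins text across pages internally.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class CrossPageSearchDemo
{
    // Same pattern as in the snippet above.
    public const string Pattern = @"((?i)(Hello([\s]+)World))";

    // Searches each page string on its own and sums the match counts.
    public static int CountPerPage(string[] pages) =>
        pages.Sum(page => Regex.Matches(page, Pattern).Count);

    // Joins the pages with a line break first, then searches once.
    public static int CountJoined(string[] pages) =>
        Regex.Matches(string.Join("\n", pages), Pattern).Count;

    public static void Main()
    {
        // Hypothetical page contents: the phrase is split across the page break.
        var pages = new[]
        {
            "First page text ending with Hello",
            "World begins the second page"
        };

        Console.WriteLine(CountPerPage(pages)); // prints 0
        Console.WriteLine(CountJoined(pages));  // prints 1
    }
}
```

The per-page search misses the phrase because neither string contains both words, while the joined search matches because the inserted line break satisfies [\s]+.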

Hi Tomas,


Thanks for sharing the resource file and code snippet.

I have tested the scenario and I am able to reproduce the same problem. For the sake of correction, I have logged it in our issue tracking system as PDFNEWNET-33948. We will investigate this issue in detail and will keep you updated on the status of a correction.

We apologize for the inconvenience.