Search Text Spanning Multiple PDF Pages

Hello Aspose Team,

We are using TextFragmentAbsorber to search for specific phrases in PDFs. However, we have a scenario where the text we are searching for is split across two pages — for example, “Service’s” appears at the end of page 1, and “architecture” appears at the top of page 2.
Currently, TextFragmentAbsorber appears to work on a per-page basis, which causes this case to be missed.
Could you please advise on:

  • Whether there is any built-in support for searching text fragments that span multiple pages.
  • If not, whether there is a workaround or recommended approach using the Aspose API (e.g., merging text across pages and applying a regex).
  • If this is not supported, whether it could be considered as a feature request for an upcoming release.

@ritikrajjalu

Can you please clarify if you are looking for a specific code example or just general guidance on handling text fragments across multiple pages?

We are looking for a specific code example or an officially recommended approach to handle the following scenario:

We are using TextFragmentAbsorber to search for phrases in a PDF.

We need to search for phrases that may span across multiple pages, e.g., the word "Agreement" is at the bottom of page 1, and "avoid" is at the top of page 2.

Our goal is to extract such cross-page phrases as a valid TextFragment (or equivalent) with information about their position if possible.
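
To make the merge-and-regex idea concrete, here is a minimal, library-agnostic sketch in Python of what we have in mind. It is not Aspose code: it assumes the text of each page has already been extracted into a list of strings (one per page, in order), and the function name `find_across_pages` is hypothetical. It merges the pages, searches the merged text, and reports which pages each match touches.

```python
import re
from bisect import bisect_right


def find_across_pages(page_texts, phrase):
    """Find `phrase` in text merged from all pages.

    page_texts: list of strings, one per page, in page order. These are
    assumed to have been extracted beforehand by whatever PDF library is used.
    Returns a list of (matched_text, first_page, last_page), pages 1-based.
    """
    words = phrase.split()
    if not words:
        return []

    # Record the offset at which each page starts in the merged string.
    starts = []
    pos = 0
    for text in page_texts:
        starts.append(pos)
        pos += len(text) + 1  # +1 for the "\n" separator between pages
    full = "\n".join(page_texts)

    # Allow any run of whitespace (including the page separator) between words.
    pattern = r"\s+".join(re.escape(word) for word in words)

    results = []
    for m in re.finditer(pattern, full):
        # Number of page starts <= offset is the 1-based page number.
        first_page = bisect_right(starts, m.start())
        last_page = bisect_right(starts, m.end() - 1)
        results.append((m.group(0), first_page, last_page))
    return results


pages = ["Terms of this Agreement", "avoid any conflict of interest."]
print(find_across_pages(pages, "Agreement avoid"))
```

This only tells us which pages a match spans. To get coordinates, we assume each half of the phrase would still have to be located on its own page (for example with a per-page TextFragmentAbsorber search), which is the part we would like guidance on.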

single_page.pdf (129.9 KB)

@ritikrajjalu

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFNET-60031

You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.