doc.Pages.Accept(textFragmentAbsorber) is taking too long to execute

Hi All,

We are trying to get all occurrences of a regex match. We are using Aspose.PDF 23.11.1, the latest version. Below is the code we use to get the TextFragments.

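// Load the PDF to search.
var doc = new Aspose.Pdf.Document(path);

// Match any text enclosed in square brackets; (?s) lets '.' span line breaks.
var searchTerm = @"\[(?s)(.*?)\]";
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber(searchTerm);
// 'true' tells the absorber to treat the search term as a regular expression.
TextSearchOptions textSearchOptions = new TextSearchOptions(true);
textFragmentAbsorber.TextSearchOptions = textSearchOptions;

// Run the search across every page (this is the slow call).
doc.Pages.Accept(textFragmentAbsorber);

TextFragmentCollection textFragmentCollection = textFragmentAbsorber.TextFragments;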

It takes more than 15-20 minutes to get the fragments, even for a small file.
Can you please suggest an approach to improve the performance?

Thanks,
Susmitha

@susmithaputhana

This needs to be investigated. Can you please share the sample PDF for our reference?

I have attached the PDF for your reference.
TestPdf.7z (8.8 MB)

@susmithaputhana

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFNET-56170

You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.
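
While PDFNET-56170 is open, one way to narrow the problem down is to run the search one page at a time and time each page, which may show whether a particular page is responsible for the delay. The sketch below is not a confirmed fix: the input path is a hypothetical placeholder, and the alternative pattern (a negated character class that finds the same bracketed text as the original lazy pattern) is only an assumption about what might reduce regex backtracking, so it should be measured against the original.

```csharp
using System;
using System.Diagnostics;
using Aspose.Pdf;
using Aspose.Pdf.Text;

class PerPageSearch
{
    static void Main(string[] args)
    {
        // Hypothetical input path; pass the real file as the first argument.
        string path = args.Length > 0 ? args[0] : "input.pdf";

        // Finds the same bracketed text as \[(?s)(.*?)\], written without a lazy quantifier.
        // Assumption: this may be cheaper to evaluate; swap the original pattern back in to compare.
        string searchTerm = @"\[([^\]]*)\]";

        using (var doc = new Document(path))
        {
            // Page numbers are 1-based in Aspose.PDF.
            for (int pageNumber = 1; pageNumber <= doc.Pages.Count; pageNumber++)
            {
                // Use a fresh absorber per page so the fragment counts do not accumulate.
                var absorber = new TextFragmentAbsorber(searchTerm)
                {
                    TextSearchOptions = new TextSearchOptions(true) // regex search
                };

                var timer = Stopwatch.StartNew();
                doc.Pages[pageNumber].Accept(absorber);
                timer.Stop();

                Console.WriteLine(
                    $"Page {pageNumber}: {absorber.TextFragments.Count} fragment(s) " +
                    $"in {timer.ElapsedMilliseconds} ms");
            }
        }
    }
}
```

If one page accounts for most of the time, reporting its page number on the ticket may help the investigation.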