Search PDF Contents Performance

I was able to get the PDF search capability working in .NET using the examples provided. However, I will have over 1,000 PDFs, averaging 50 pages each, and the search takes many minutes. Is there an optimized search that runs much faster? Maybe some sort of indexing?

@polarlight

Thanks for contacting support.

You may lower memory consumption by absorbing/searching text page by page. Please check the following code snippet, which you can adapt to your program's requirements:

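// dataDir is assumed to hold the path of the folder that contains input.pdf.
using (Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(dataDir + "input.pdf"))
{
    foreach (Aspose.Pdf.Page page in pdfDocument.Pages)
    {
        // Use a fresh absorber per page so results from earlier pages are not retained.
        Aspose.Pdf.Text.TextFragmentAbsorber absorber = new Aspose.Pdf.Text.TextFragmentAbsorber();
        page.Accept(absorber);

        // Process absorber.TextFragments for this page here.

        // Release the page's resources before moving on to the next page.
        page.Dispose();
    }
}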
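
On the indexing part of the question: the snippet above does not build an index by itself. One possible approach, sketched here under assumptions, is to extract each page's text once and store it in a simple in-memory word index, so that repeated searches become dictionary lookups instead of re-reading every PDF. PageTextIndex, AddPage, and Find are hypothetical names used only for illustration and are not part of Aspose.Pdf. The sketch uses standard .NET collections and leaves the text extraction itself to code like the loop above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not an Aspose.Pdf type): maps each lowercase word
// to the set of (file, page) pairs on which it occurs.
public sealed class PageTextIndex
{
    private readonly Dictionary<string, HashSet<(string File, int Page)>> _postings =
        new Dictionary<string, HashSet<(string File, int Page)>>();

    // Assumption: splitting on whitespace and basic punctuation is good enough.
    private static readonly char[] Separators =
        { ' ', '\t', '\r', '\n', '.', ',', ';', ':', '!', '?', '(', ')', '"' };

    // Call once per page with the text already extracted from that page.
    public void AddPage(string file, int pageNumber, string pageText)
    {
        foreach (string word in pageText.Split(Separators, StringSplitOptions.RemoveEmptyEntries))
        {
            string key = word.ToLowerInvariant();
            if (!_postings.TryGetValue(key, out var hits))
            {
                hits = new HashSet<(string File, int Page)>();
                _postings[key] = hits;
            }
            hits.Add((file, pageNumber));
        }
    }

    // Returns the pages that contain the word, ordered by file name and page number.
    public IReadOnlyList<(string File, int Page)> Find(string word)
    {
        return _postings.TryGetValue(word.ToLowerInvariant(), out var hits)
            ? hits.OrderBy(h => h.File, StringComparer.Ordinal).ThenBy(h => h.Page).ToList()
            : new List<(string File, int Page)>();
    }
}
```

With this sketch you would call AddPage for every page during a single pass over the documents, then call Find for each search term. It matches whole words only and keeps everything in memory; phrase search or an index that survives restarts would need a different design, such as a dedicated full-text search library.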

In case you still face any issue, please share your sample PDF document with us along with complete working sample code. We will test the scenario in our environment and address it accordingly.