Aspose PDF is throwing 'System.OutOfMemoryException' while extracting text from larger PDFs (>60MB)


Library : Aspose.Pdf

version : 10.5.0.0

Issue : Aspose PDF throws 'System.OutOfMemoryException' while extracting text from larger PDFs (>60MB). I also sometimes get this exception randomly on different files when I run a set of such files in a loop. It looks like memory is not being freed properly. Below is the code snippet I used; a sample file is also attached.

code snippet

//open document
Document pdfDocument = new Document("input.pdf");
//create TextAbsorber object to extract text
TextAbsorber textAbsorber = new TextAbsorber();
//accept the absorber for all the pages
pdfDocument.Pages.Accept(textAbsorber);
//get the extracted text
string extractedText = textAbsorber.Text;
// create a writer and open the file
TextWriter tw = new StreamWriter("extracted-text.txt");
// write a line of text to the file
tw.WriteLine(extractedText);
// close the stream
tw.Close();

Hi Subhash,

Thanks for your inquiry. I have tested your scenario with the shared document using Aspose.Pdf for .NET 10.9.0 and was able to reproduce the reported exception. For further investigation, I have logged an issue in our issue tracking system as PDFNEWNET-39674 and linked your request to it. We will keep you updated via this thread regarding the issue status.

We are sorry for the inconvenience caused.
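
Until the logged issue is resolved, a per-page variant of the snippet above may be worth trying. This is only a sketch and not a confirmed fix for PDFNEWNET-39674. It assumes that Document can be used in a using block in your version and that a separate TextAbsorber can be accepted by each page; the class name PerPageTextExtraction is made up for illustration, and the file names are the ones from the original snippet.

```csharp
using System.IO;
using Aspose.Pdf;
using Aspose.Pdf.Text;

class PerPageTextExtraction
{
    static void Main()
    {
        // Dispose the document and the writer deterministically
        // instead of waiting for the garbage collector.
        // Assumption: Document implements IDisposable in the version in use.
        using (Document pdfDocument = new Document("input.pdf"))
        using (StreamWriter writer = new StreamWriter("extracted-text.txt"))
        {
            // Aspose.Pdf page indexes start at 1.
            for (int pageNumber = 1; pageNumber <= pdfDocument.Pages.Count; pageNumber++)
            {
                // A fresh absorber per page, so the extracted text of the whole
                // document is never held in a single string.
                TextAbsorber textAbsorber = new TextAbsorber();
                pdfDocument.Pages[pageNumber].Accept(textAbsorber);

                // Write this page's text straight to the output file.
                writer.WriteLine(textAbsorber.Text);
            }
        }
    }
}
```

Writing each page's text to the file as soon as it is extracted avoids building one very large string, and the using blocks release the document and the file handle before the next file in a loop is opened. Whether this is enough to avoid the exception on 60MB+ files is untested.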

Best Regards,