Extract text from PDF document using Aspose.PDF for .NET - TextAbsorber No Longer Thread Safe

cashcache1 · February 3, 2015, 6:03pm

We have identified another threading problem in the latest release (10.0.0). When extracting text from PDF documents using multiple threads, CPU usage randomly spikes to 100% and the extraction never finishes. In most cases this happens when multiple threads are extracting text from documents with many pages. Please note that version 9.7.0 works fine in multi-threaded environments.

The problem appears to happen either when adding the TextAbsorber to the Page object, or when retrieving the text from the TextAbsorber.Text property.

I have attached a sample application that demonstrates the problem, and I have also included many PDFs in the working folder. When no lock is used, 9 times out of 10 CPU usage goes to 100% (and stays there) before the test finishes. If a lock is used, the test runs fine (but very slowly). The code is commented with more information.

Please note, you will need to supply a valid license. It will not exhibit the problem with the trial version.
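
For readers without the attachment, here is a minimal sketch of the pattern described above. It is not the attached sample itself. The class name, the `UseLock` switch, the license file name, and the folder argument are assumptions made for illustration. Accepting one absorber per page is only one way to drive the extraction.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using Aspose.Pdf;
using Aspose.Pdf.Text;

class ExtractionRepro
{
    // Hypothetical switch: true serializes all extraction (slow but stable on 10.0.0),
    // false lets the threads run concurrently, which is where the hang was observed.
    static readonly bool UseLock = true;
    static readonly object Gate = new object();

    // Extracts the text of every page in one document with a single absorber.
    static string ExtractText(string path)
    {
        using (var document = new Document(path))
        {
            var absorber = new TextAbsorber();
            foreach (Page page in document.Pages)
            {
                page.Accept(absorber);   // one of the two calls suspected in the report
            }
            return absorber.Text;        // the other suspected call
        }
    }

    static void Main(string[] args)
    {
        // A valid license is needed to see the problem; the file name is an assumption.
        new License().SetLicense("Aspose.Pdf.lic");

        string folder = args.Length > 0 ? args[0] : ".";
        string[] files = Directory.GetFiles(folder, "*.pdf");

        // Each file is processed on a thread-pool thread.
        Parallel.ForEach(files, path =>
        {
            string text;
            if (UseLock)
            {
                lock (Gate) { text = ExtractText(path); }
            }
            else
            {
                text = ExtractText(path);
            }
            Console.WriteLine("{0}: {1} characters", Path.GetFileName(path), text.Length);
        });
    }
}
```

With `UseLock` set to `true`, only one thread is inside the Aspose calls at any time, which matches the slow but stable behaviour described above. Setting it to `false` corresponds to the failing case.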

Thank you,

Scott

tilal.ahmad · February 5, 2015, 8:11am

Hi Scott,

Thanks for your inquiry. We have tested your shared sample code and noticed the 100% CPU usage issue with Aspose.Pdf for .NET 10.0.0. We have logged a regression ticket PDFNEWNET-38168 with highest priority in our tracking system for further investigation and resolution. We will notify you as soon as it is resolved.

We are sorry for the inconvenience caused.

Best Regards,

aspose.notifier · March 6, 2015, 12:36pm

The issues you have found earlier (filed as PDFNEWNET-38168) have been fixed in Aspose.Pdf for .NET 10.2.0.
