PDFs that crash on text extraction

programcsharp · March 2, 2006, 9:30am

I’ve been evaluating the text extraction capabilities of Aspose.Pdf.Kit and came across a bunch of PDFs that cause exceptions. I know a couple fail because they are corrupt or password protected, but most work fine in other programs or viewers. Can you take a look and get back to me ASAP, as I am evaluating this and several other text extraction components?

http://datarg.com/docs/badpdf.rar

Also, the performance doesn’t seem to be what it should be. Is that because of the text garbling, or is it standard?

GeorgieYuan · March 2, 2006, 10:20am

Dear programcsharp,

Thank you for considering Aspose.

I will download and test the PDF files and reply to you soon.

forever · March 14, 2006, 6:09pm

Dear customer,

Sorry for the late reply. Georgie is working on this issue and will reply to you today or tomorrow.

GeorgieYuan · March 14, 2006, 9:49pm

Dear programcsharp,

Sorry for the late reply. I have tested the PDF files and found that this is a bug. Fixing it will take about one month.