IllegalArgumentException trying to extract text from some PDF files

The extractText() method of the PdfExtractor class in pdf-kit throws an IllegalArgumentException from deep inside the Aspose library when reading some PDF files. It does not fail on every file. I’m using aspose-pdf-kit-4.4.0-java, which was part of the Aspose total package that I downloaded today, February 13, 2013. The jar file has a modification date of September 24, 2012.



I’ve created a simple Java program that lets you open PDF files and try to extract their text. The program and three PDF files that trigger the failure are in the attached zip file. A sample stack trace from the failure follows:



java.lang.IllegalArgumentException: Comparison method violates its general contract!
    at java.util.ComparableTimSort.mergeHi(ComparableTimSort.java:835)
    at java.util.ComparableTimSort.mergeAt(ComparableTimSort.java:453)
    at java.util.ComparableTimSort.mergeForceCollapse(ComparableTimSort.java:392)
    at java.util.ComparableTimSort.sort(ComparableTimSort.java:191)
    at java.util.ComparableTimSort.sort(ComparableTimSort.java:146)
    at java.util.Arrays.sort(Arrays.java:472)
    at java.util.Collections.sort(Collections.java:155)
    at com.aspose.pdf.kit.iw.o(Unknown Source)
    at com.aspose.pdf.kit.iw.k(Unknown Source)
    at com.aspose.pdf.kit.iw.a(Unknown Source)
    at com.aspose.pdf.kit.iw.a(Unknown Source)
    at com.aspose.pdf.kit.iw.a(Unknown Source)
    at com.aspose.pdf.kit.db.a(Unknown Source)
    at com.aspose.pdf.kit.PdfExtractor.extractText(Unknown Source)
    at PDFKitTest$safeMain.run(PDFKitTest.java:61)
    at java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:251)
    at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:721)
    at java.awt.EventQueue.access$200(EventQueue.java:103)
    at java.awt.EventQueue$3.run(EventQueue.java:682)
    at java.awt.EventQueue$3.run(EventQueue.java:680)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.security.ProtectionDomain$1.doIntersectionPrivilege(ProtectionDomain.java:76)
    at java.awt.EventQueue.dispatchEvent(EventQueue.java:691)
    at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:242)
    at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:161)
    at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:150)
    at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:146)
    at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:138)
    at java.awt.EventDispatchThread.run(EventDispatchThread.java:91)





Thanks in advance for your help in fixing or developing a workaround for this problem.



Best regards,



Steven Glass

Hi Steven,


Thanks for reporting this issue. I have managed to reproduce it at my end and logged it as PDFKITJAVA-33283 in our bug tracking system for further investigation and resolution. You will be notified via this thread as soon as it is resolved.


Sorry for the inconvenience.
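For background on the message in the trace: the JDK's TimSort (used by Collections.sort since Java 7) throws "Comparison method violates its general contract!" when it detects that a compareTo implementation is inconsistent. The sketch below is only an illustration of that kind of inconsistency. The Fragment class and the violatesContract helper are hypothetical names invented for this example; they are not Aspose classes and do not describe the actual cause inside pdf-kit.

```java
import java.util.Arrays;
import java.util.List;

final class SortContractSketch {

    /** Hypothetical stand-in for a positioned text fragment; not an Aspose class. */
    static final class Fragment implements Comparable<Fragment> {
        final double y;

        Fragment(double y) {
            this.y = y;
        }

        // Deliberately broken: values closer than 1.0 compare as equal,
        // so "equal" is not transitive (0.0 ~ 0.6 and 0.6 ~ 1.2, but 0.0 < 1.2).
        @Override
        public int compareTo(Fragment other) {
            if (Math.abs(y - other.y) < 1.0) {
                return 0;
            }
            return y < other.y ? -1 : 1;
        }
    }

    /** Returns true if compareTo breaks the Comparable contract on the given sample. */
    static <T extends Comparable<T>> boolean violatesContract(List<T> items) {
        for (T a : items) {
            for (T b : items) {
                int ab = Integer.signum(a.compareTo(b));
                if (ab != -Integer.signum(b.compareTo(a))) {
                    return true; // antisymmetry broken
                }
                for (T c : items) {
                    int bc = Integer.signum(b.compareTo(c));
                    int ac = Integer.signum(a.compareTo(c));
                    if (ab > 0 && bc > 0 && ac <= 0) {
                        return true; // transitivity broken
                    }
                    if (ab == 0 && ac != bc) {
                        return true; // equal elements must order identically against a third
                    }
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Fragment> sample =
                Arrays.asList(new Fragment(0.0), new Fragment(0.6), new Fragment(1.2));
        System.out.println("contract violated: " + violatesContract(sample));
    }
}
```

As a possible stopgap until a fixed release is available, the JVM can be asked to use the pre-Java 7 merge sort, which does not perform this check, for example by starting the test program with java -Djava.util.Arrays.useLegacyMergeSort=true PDFKitTest. This only hides the symptom and has not been verified against pdf-kit, so the resulting text order may still be unreliable.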

Best Regards,

Hi Steven,


Thanks for your patience. We have good news for you. The issue you reported has been resolved, and the fix will be included in the upcoming release of Aspose.Pdf.Kit for Java 4.5.0, planned for March 2013.

Best Regards,

The issues you have found earlier (filed as PDFKITJAVA-33283) have been fixed in Aspose.Pdf.Kit for Java 4.5.0.

