Create Searchable PDF using Aspose.PDF for Java - HOCR Callback method produces exception

kpboerema · April 8, 2020, 12:03pm

We are using the HOCR callback method to OCR images and add the resulting text to the PDF.
However, this results in the following exception and stack trace:
Exception in thread "main" class com.aspose.pdf.internal.ms.System.lh: Invalid index: index should be in the range [1..n] where n equals to the operators count.
com.aspose.pdf.OperatorCollection.lI(Unknown Source)
com.aspose.pdf.OperatorCollection.get_Item(Unknown Source)
com.aspose.pdf.internal.l8f.l0v.lI(Unknown Source)
com.aspose.pdf.ADocument.convert(Unknown Source)
com.aspose.pdf.Document.convert(Unknown Source)
com.rabobank.testpdf.CheckForLibor.checkForLibor(CheckForLibor.java:56)
com.rabobank.testpdf.CheckForLibor.main(CheckForLibor.java:30)
at com.aspose.pdf.OperatorCollection.lI(Unknown Source)
at com.aspose.pdf.OperatorCollection.get_Item(Unknown Source)
at com.aspose.pdf.internal.l8f.l0v.lI(Unknown Source)
at com.aspose.pdf.ADocument.convert(Unknown Source)
at com.aspose.pdf.Document.convert(Unknown Source)
at com.rabobank.testpdf.CheckForLibor.checkForLibor(CheckForLibor.java:56)
at com.rabobank.testpdf.CheckForLibor.main(CheckForLibor.java:30)

The hOCR string returned by the OCR callback (created by the Tesseract OCR module) is:

\W/z WY HERBERT yyw FREEHILLS IWS

Any ideas why this is happening?
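
For anyone debugging a similar failure, it can help to rule out the callback's return value before suspecting the library. The following is a minimal sketch, not part of the original report: the class name `HocrCheck` and its methods are hypothetical, and the check assumes Tesseract-style hOCR in which each word is a `span` with class `ocrx_word` and a `bbox` in its `title` attribute. It uses only the Java standard library and does not call Aspose.PDF. In this thread the problem turned out to be in the library, so passing this check does not guarantee that `convert` will succeed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not an Aspose API): sanity-checks an hOCR string
// before it is returned from the OCR callback.
public class HocrCheck {

    // Assumption: Tesseract-style markup, e.g.
    // <span class='ocrx_word' id='word_1_1' title='bbox 10 10 120 40; x_wconf 91'>
    private static final Pattern WORD_WITH_BBOX = Pattern.compile(
            "<span[^>]*class=['\"]ocrx_word['\"][^>]*title=['\"][^'\"]*"
            + "bbox\\s+\\d+\\s+\\d+\\s+\\d+\\s+\\d+");

    // Counts word spans that carry a four-number bounding box.
    static int countWordsWithBbox(String hocr) {
        if (hocr == null) {
            return 0;
        }
        Matcher m = WORD_WITH_BBOX.matcher(hocr);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // True if the string has a page element and at least one positioned word.
    static boolean looksLikeHocr(String hocr) {
        return hocr != null
                && hocr.contains("ocr_page")
                && countWordsWithBbox(hocr) > 0;
    }

    public static void main(String[] args) {
        // Illustrative sample only; the coordinates are made up.
        String sample =
                "<div class='ocr_page' id='page_1' title='bbox 0 0 800 600'>"
                + "<span class='ocrx_word' id='word_1_1' "
                + "title='bbox 10 10 120 40; x_wconf 91'>HERBERT</span> "
                + "<span class='ocrx_word' id='word_1_2' "
                + "title='bbox 130 10 260 40; x_wconf 88'>FREEHILLS</span>"
                + "</div>";
        System.out.println("words with bbox: " + countWordsWithBbox(sample));
        System.out.println("looks like hOCR: " + looksLikeHocr(sample));
    }
}
```

Logging these two values inside the callback, just before returning the string, shows whether the OCR engine produced positioned words or only plain text for a given image.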

Adnan.Ahmad · April 8, 2020, 10:36pm

@kpboerema,

Thank you for contacting support.

Could you please share a sample project along with the source files so that we can investigate this issue further?

kpboerema · April 9, 2020, 9:06am

See the attached code and sample PDF: error.zip (847.7 KB)

Adnan.Ahmad · April 9, 2020, 2:10pm

@kpboerema,

I have reproduced the issue you mentioned and logged it as PDFJAVA-39319 in our issue tracking system. We will look into the details and keep you posted on the status of the fix. Please be patient and allow us a little time.

We are sorry for the inconvenience.

kpboerema · April 22, 2020, 5:33pm

Hello,
Any idea when we can expect a fix?
It is currently blocking our project.

Rgds,
Klaas Pieter

Adnan.Ahmad · April 22, 2020, 8:45pm

@kpboerema,

I would like to inform you that this issue was added to our issue tracking system only recently. As per our company policy, first priority for investigation is given to Paid Support (i.e. Enterprise and Priority Support) on a first come, first served basis. After that, issues from the normal support forum are scheduled for investigation, also on a first come, first served basis. We ask for your patience and will share good news with you soon.

aspose.notifier · June 23, 2020, 7:38pm

The issues you have found earlier (filed as PDFJAVA-39319) have been fixed in Aspose.PDF for Java 20.6.

kpboerema · June 28, 2020, 2:25pm

Thanks, I verified it and it works!