Converting certain PDFs to Word fails with "com.aspose.pdf.internal.l103p.l0l: Overflow"

Hi
I am converting certain PDF documents to Word documents using RecognitionMode.EnhancedFlow, and the conversion fails with the error:

com.aspose.pdf.internal.l103p.l0l: Overflow

The code snippet I use is:

Document doc = new Document("Aspose_support_internal_overflow.pdf");

com.aspose.pdf.DocSaveOptions docSaveOptions = new com.aspose.pdf.DocSaveOptions();
docSaveOptions.setFormat(com.aspose.pdf.DocSaveOptions.DocFormat.DocX);
/*
Locale.setDefault(Locale.ENGLISH);
docSaveOptions.setRelativeHorizontalProximity(2.5f);
docSaveOptions.setRecognizeBullets(false);

RepairOptions repairOptions = new RepairOptions();
doc.repair(repairOptions);
doc.optimizeResources(new OptimizationOptions());
for (Page page : doc.getPages()) {
    // Access the image collection from each page’s resources
    // The delete() method without parameters removes all images from the collection
    page.getResources().getImages().delete();
}
*/
docSaveOptions.setMode(RecognitionMode.EnhancedFlow);
doc.save(output, docSaveOptions);

The commented-out part shows the different options I tried to keep the conversion from failing.

Using RecognitionMode.Flow does work, although it takes a very long time, but the layout is not what I am looking for. I really need RecognitionMode.EnhancedFlow.

Attached is a sample redacted PDF that is causing the issue.
Aspose_support_internal_overflow.pdf (1.5 MB)

Thanks

@brissonp

  1. The error com.aspose.pdf.internal.l103p.l0l: Overflow during PDF to Word conversion using RecognitionMode.EnhancedFlow may be related to an internal processing issue within the Aspose.PDF library when handling specific PDF structures. This type of error can occur when the library encounters complex or malformed content that exceeds expected limits during conversion.

  2. Switching to RecognitionMode.Flow avoids the overflow error but results in lower layout fidelity, which suggests that EnhancedFlow is more sensitive to intricate PDF elements or formatting.

  3. There are precedents of stack-related errors occurring in Aspose.PDF during document conversions, such as when converting MHTML to PDF, which suggests that this could be a recurring issue with certain document complexities in the library’s internal processing logic.

Given these observations, it is recommended to report this issue to Aspose support with the sample PDF file for further investigation, as it may be a bug or limitation in the current version of the library.
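
While the issue is being investigated, a possible stopgap is to attempt EnhancedFlow first and fall back to Flow only for the documents that fail. Below is a minimal sketch of that pattern, not an official Aspose recommendation. The names FallbackConverter, Attempt, and firstSuccessful are hypothetical helpers introduced for illustration, and the sketch assumes the Overflow error surfaces as a regular Java Exception. The lambdas in main are stand-ins: in real use each one would set the recognition mode with docSaveOptions.setMode(...) and call doc.save(output, docSaveOptions), as in the snippet from the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Aspose.PDF): tries conversion attempts in order.
class FallbackConverter {

    /** One conversion attempt; in real use the body would call doc.save(output, docSaveOptions). */
    @FunctionalInterface
    interface Attempt {
        void run() throws Exception;
    }

    /**
     * Runs the attempts in insertion order and returns the label of the first
     * one that completes without throwing. If every attempt fails, throws an
     * IllegalStateException carrying each failure as a suppressed exception.
     */
    static String firstSuccessful(Map<String, Attempt> attempts) {
        IllegalStateException allFailed =
                new IllegalStateException("All conversion attempts failed");
        for (Map.Entry<String, Attempt> entry : attempts.entrySet()) {
            try {
                entry.getValue().run();
                return entry.getKey();
            } catch (Exception ex) {
                // Assumption: the Overflow error is an Exception, not an Error.
                allFailed.addSuppressed(ex);
            }
        }
        throw allFailed;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the order: preferred mode first, fallback second.
        Map<String, Attempt> attempts = new LinkedHashMap<>();
        // Stand-in for the EnhancedFlow save that fails on the sample PDF.
        attempts.put("EnhancedFlow", () -> { throw new ArithmeticException("Overflow"); });
        // Stand-in for the Flow save that succeeds.
        attempts.put("Flow", () -> { });
        System.out.println(firstSuccessful(attempts)); // prints "Flow"
    }
}
```

The returned label makes it easy to log which documents had to fall back, so they can be converted again with EnhancedFlow once a fix is released. This does not address the layout difference between the two modes.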