Document.validate() throws Out Of Memory error

Hi,

We are using the Aspose.PDF for Java library to convert PDF files to PDF/A-1b format. We have already processed thousands of documents up to 2GB in size. However, we are stuck on one particular document, which fails even though it is only 100MB in size. When we call Document.validate() on it (see code snippet below), it keeps allocating more and more RAM until it runs out of memory (see stack trace below). We have also updated to the latest version of the library (20.7), but the same error occurs.

Code snippet:

		Document document = new Document(filePath);

		if (document.getPdfFormat() == PdfFormat.PDF_A_1B) {
			LOGGER.trace("Document already in PDF/A-1b.");
		} else {
			LOGGER.trace("Converting document to PDF/A-1b.");
			document.validate(task.getFileInDocDir("validation_log.xml").getPath(), PdfFormat.PDF_A_1B);
			document.convert(task.getFileInDocDir("conversion_log.xml").getPath(), PdfFormat.PDF_A_1B, ConvertErrorAction.Delete);
			document.save(filePath);
		}
		document.close();

Error stack trace:

java.lang.OutOfMemoryError: GC overhead limit exceeded
at com.aspose.pdf.internal.ms.System.Collections.Generic.lf.lf(Unknown Source)
at com.aspose.pdf.internal.ms.System.Collections.Generic.lf.lI(Unknown Source)
at com.aspose.pdf.internal.ms.System.Collections.Generic.lf.set_Item(Unknown Source)
at com.aspose.pdf.internal.l4l.l0n.lf(Unknown Source)
at com.aspose.pdf.internal.l4l.l0n.a_(Unknown Source)
at com.aspose.pdf.internal.l4t.lj.lf(Unknown Source)
at com.aspose.pdf.internal.l4t.lI.lI(Unknown Source)
at com.aspose.pdf.internal.l4l.lv.lI(Unknown Source)
at com.aspose.pdf.internal.l4l.lh.lI(Unknown Source)
at com.aspose.pdf.internal.l4y.l1n.lb(Unknown Source)
at com.aspose.pdf.internal.l4j.lk.ld(Unknown Source)
at com.aspose.pdf.internal.l4j.lI.ld(Unknown Source)
at com.aspose.pdf.internal.l4j.lk.<init>(Unknown Source)
at com.aspose.pdf.internal.l4j.lI.<init>(Unknown Source)
at com.aspose.pdf.internal.l4j.lf.<init>(Unknown Source)
at com.aspose.pdf.internal.l4p.lk.lI(Unknown Source)
at com.aspose.pdf.internal.l4y.l1h.l0if(Unknown Source)
at com.aspose.pdf.internal.l5if.ld.lj(Unknown Source)
at com.aspose.pdf.internal.l5if.lv.lb(Unknown Source)
at com.aspose.pdf.internal.l5if.lv.<init>(Unknown Source)
at com.aspose.pdf.internal.l5if.ld.lI(Unknown Source)
at com.aspose.pdf.internal.l5if.ly.lI(Unknown Source)
at com.aspose.pdf.internal.l5if.ly.lb(Unknown Source)
at com.aspose.pdf.internal.l5if.ly.lu(Unknown Source)
at com.aspose.pdf.internal.l5if.ly.ly(Unknown Source)
at com.aspose.pdf.internal.l5if.l0t.lI(Unknown Source)
at com.aspose.pdf.internal.l5if.l0t.lI(Unknown Source)
at com.aspose.pdf.internal.l5if.l0t.le(Unknown Source)
at com.aspose.pdf.internal.l5if.l0t.<init>(Unknown Source)
at com.aspose.pdf.internal.l5if.l0t.<init>(Unknown Source)
at com.aspose.pdf.internal.l5if.l0t.<init>(Unknown Source)
at com.aspose.pdf.internal.l8k.ly.lI(Unknown Source)

Thank you for your reply.

Marek

@ixtent

Would you please upload your PDF to Dropbox or Google Drive and share the link with us? We will test the scenario in our environment and address it accordingly.

Hi, @asad.ali

Thank you for your reply. I can share the document using OneDrive. Since it is our customer’s document, we were granted permission to share it for this specific use only. Could you provide an email address so that we can share it with that address only?

Regards,
Marek

@ixtent

Please share the link in a private message. To send one, click on the username and press the blue Message button.

@ixtent

We have tested the scenario in our environment using Aspose.PDF for Java 20.7 and observed that the API hung during processing. Please note that the Document.validate() method should be executed after converting the PDF to PDF/A. We tested the scenario that way and noticed that the process hung on the convert() method as well.

We have logged an issue as PDFJAVA-39696 in our issue tracking system so that it can be corrected. We will investigate it in detail and keep you posted on the status of its correction. Please be patient and give us some time.

We are sorry for the inconvenience.

PS: Would you kindly try increasing the Java heap size and let us know whether that resolves the issue? Also, please share your current heap size.
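
For reference, the heap limit that a running JVM actually applies can be read from inside the process. The following is a minimal sketch using only the standard library; the class name HeapReport is a hypothetical helper and not part of Aspose.PDF. The maximum is normally controlled with the -Xmx JVM option.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper (not part of Aspose.PDF): reports heap figures of the running JVM.
final class HeapReport {
    private static final long MB = 1024L * 1024L;

    /** Maximum amount of memory the JVM will attempt to use, in MB. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    /** Heap currently in use, in MB, as reported by the memory MXBean. */
    static long usedHeapMb() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed() / MB;
    }

    public static void main(String[] args) {
        System.out.println("Max heap:  " + maxHeapMb() + " MB");
        System.out.println("Used heap: " + usedHeapMb() + " MB");
    }
}
```

Logging these two values right before the validate() call would show which limit the process is really running with.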

Hi @asad.ali,

Thanks for the information.

The code we use is based on the official example on GitHub; please see the method pdfTopdfA1bConversion() at https://github.com/aspose-pdf/Aspose.PDF-for-Java/blob/master/Examples/src/main/java/com/aspose/pdf/examples/AsposePdfExamples/DocumentConversion/ConvertPDFToPDFAFormat.java

We have already tried increasing the heap size up to 8GB, but the behavior is independent of the heap size. It just keeps allocating more and more memory until it hangs. Validating a 100MB file probably shouldn’t use more than 8GB of RAM anyway.
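
One way to keep a batch job from stalling on such a document is to put a time limit on the call. The sketch below is one possible approach using only java.util.concurrent; TimeLimitedTask and runWithTimeout are hypothetical names, not Aspose.PDF API, and the approach assumes the work can safely run on a separate thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper (not part of Aspose.PDF): runs a task with a time limit.
final class TimeLimitedTask {

    /**
     * Runs the task on a worker thread and waits at most timeoutMillis for it.
     * Returns the task's result, or null if it did not finish in time.
     */
    static <T> T runWithTimeout(Callable<T> task, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "time-limited-worker");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupts the worker; code that ignores interrupts keeps running.
            future.cancel(true);
            return null;
        } catch (ExecutionException e) {
            // The task itself failed (this includes errors such as OutOfMemoryError).
            throw new IllegalStateException("Task failed", e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A call such as `TimeLimitedTask.runWithTimeout(() -> document.validate(logPath, PdfFormat.PDF_A_1B), 600_000)` (with logPath standing in for the log file path) would then give up after ten minutes instead of blocking. Note that cancelling only interrupts the worker thread; if the library call ignores interrupts, it keeps running and holding its memory, so running the conversion in a separate process would isolate it more reliably.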

Regards,
Marek

@ixtent

It is all right to call Document.validate() before the conversion, but in that case it will return false, as the document has not been converted to PDF/A yet.

Thanks for sharing the information.

We have updated the details of the logged ticket and will inform you as soon as we have additional updates regarding its fix.

Hi @asad.ali

Is there any progress on the issue?

Regards,
Marek

@ixtent

Unfortunately, the earlier logged ticket is not yet resolved and is still being investigated. We will post updates in this forum thread as soon as the ticket is resolved. Please give us some time.

We are sorry for the inconvenience.

The issues you have found earlier (filed as PDFJAVA-39696) have been fixed in Aspose.PDF for Java 21.2.