Aspose.PDF for Java: Out of Memory on Merge of Large Number of Documents


#1

When we split a document into more than 1000 individual pages and attempt to merge those files back into a single multi-page PDF document, the following out-of-memory exception is thrown:
Oct 28, 2019 10:43:06 AM org.junit.platform.launcher.core.DefaultLauncher handleThrowable
WARNING: TestEngine with ID 'junit-jupiter' failed to execute tests
java.lang.OutOfMemoryError: Java heap space
at java.lang.Throwable.fillInStackTrace(Native Method)
at java.lang.Throwable.<init>(Throwable.java:88)
at java.lang.Throwable.<init>(Throwable.java:99)
at java.lang.Error.<init>(Error.java:70)
at java.lang.VirtualMachineError.<init>(VirtualMachineError.java:53)
at java.lang.OutOfMemoryError.<init>(OutOfMemoryError.java:58)
at java.lang.String.<init>(String.java:673)
at java.lang.String.<init>(String.java:608)
at com.aspose.pdf.internal.ms.System.l10l.lI(Unknown Source)
at com.aspose.pdf.internal.l8k.l0l.lf(Unknown Source)
at com.aspose.pdf.internal.l9j.lu.lI(Unknown Source)
at com.aspose.pdf.internal.l9j.ld.lI(Unknown Source)
at com.aspose.pdf.internal.l5y.l1l$lI.deserialize(Unknown Source)
at com.aspose.pdf.internal.l9u.le.deserialize(Unknown Source)
at com.aspose.pdf.internal.l5y.l1j$lI.deserialize(Unknown Source)
at com.aspose.pdf.internal.l9u.le.deserialize(Unknown Source)
at com.aspose.pdf.internal.l0k.lh.lI(Unknown Source)
at com.aspose.pdf.internal.l0k.lh.lI(Unknown Source)
at com.aspose.pdf.internal.l0k.lh.lI(Unknown Source)
at com.aspose.pdf.internal.l5y.l1j.l3y(Unknown Source)
at com.aspose.pdf.internal.l5y.l1j.l3v(Unknown Source)
at com.aspose.pdf.internal.l5y.l1j.l5l(Unknown Source)
at com.aspose.pdf.internal.l8k.l0v.lf(Unknown Source)
at com.aspose.pdf.internal.l0n.l0if.lt(Unknown Source)
at com.aspose.pdf.DocumentInfo.<init>(Unknown Source)
at com.aspose.pdf.ADocument.l1p(Unknown Source)
at com.aspose.pdf.ADocument.lI(Unknown Source)
at com.aspose.pdf.ADocument.<init>(Unknown Source)
at com.aspose.pdf.Document.<init>(Unknown Source)
at com.aspose.pdf.facades.APdfFileEditor.lI(Unknown Source)
at com.aspose.pdf.facades.APdfFileEditor.concatenate(Unknown Source)
at com.aspose.pdf.facades.PdfFileEditor.concatenate(Unknown Source)
at com.epiq.discovery.pdf.utils.PDFUtils.pdfMergeImageImagePDFBox(PDFUtils.java:68)
at com.epiq.discovery.pdf.utils.PDFUtilsTest.pdfMergeImageImagePDFBox(PDFUtilsTest.java:69)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:675)
at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:125)

Below is a snippet of the code we are using to attempt this task:
PdfFileEditor pdfEditor = new PdfFileEditor();
pdfEditor.setIncrementalUpdates(true);
pdfEditor.setConcatenationPacketSize(100);
pdfEditor.concatenate(singlePageFilePathList.toArray(new String[0]), outputFilePath);

Thanks in advance for any suggestions.


#2

@dmckinney

Would you kindly try Aspose.PDF for Java 19.9 in your environment and also try increasing the Java heap size? If you still face the issue, please share a sample PDF document along with your environment details, i.e. OS name and version, JDK version, Java heap size, and application type. We will test the scenario in our environment and address it accordingly.
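
When raising the heap size, it can help to confirm that the setting actually reaches the JVM that runs the merge, since build tools often run tests in a separate JVM with its own memory settings. A minimal sketch, assuming it is called from the same process that performs the merge; the class name HeapReport and its methods are hypothetical and not part of Aspose.PDF:

```java
// Hypothetical helper (not part of Aspose.PDF): reports the heap limits of the running JVM.
final class HeapReport {
    private static final long MB = 1024L * 1024L;

    /** Maximum heap the JVM will attempt to use, in whole megabytes. */
    public static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    /** One-line summary of used, committed, and maximum heap, suitable for logging. */
    public static String describe() {
        Runtime rt = Runtime.getRuntime();
        long usedMb = (rt.totalMemory() - rt.freeMemory()) / MB;
        long committedMb = rt.totalMemory() / MB;
        return "heap used=" + usedMb + " MB, committed=" + committedMb
                + " MB, max=" + maxHeapMb() + " MB";
    }

    public static void main(String[] args) {
        // Print the figures once; in a test, log this just before the merge call.
        System.out.println(describe());
    }
}
```

If the reported maximum does not match the value passed with -Xmx, the option is probably being applied to a different JVM than the one running the test.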


#3

Can you provide an SFTP site in order for us to upload some sample files?


#4

@aweech

You can attach your sample files to this post using the Upload button. If your files are too large, you may upload them to a public file-sharing service, e.g. Dropbox or Google Drive, and share the link with us.


#5

The files are about 600 MB. I am not able to use a public file-sharing service. Do you have any other way for us to send these files?


#6

@aweech

Would you please confirm whether you are using the latest version of the API? Aspose.PDF for Java 19.10 has just been released, and we request that you try your scenario with it. Also, please concatenate the PDF files using the DOM approach, which is recommended. While splitting the PDF documents, you can use the Page.Dispose() method to free up memory.
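
One way to keep memory bounded when merging more than 1000 single-page files is to merge in stages: combine the files in fixed-size batches into intermediate files, then combine the intermediates. The sketch below shows only the batching logic in plain Java. The class StagedMerge, its methods, and the ".partN.pdf" naming are hypothetical, and the actual merge is passed in as a callback so that no Aspose.PDF call is assumed here. Whether this avoids the OutOfMemoryError in this particular case has not been confirmed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical helper (not part of Aspose.PDF): plans and drives a two-stage merge.
final class StagedMerge {

    /** Splits the paths into consecutive batches of at most batchSize, preserving order. */
    public static List<List<String>> partition(List<String> paths, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < paths.size(); start += batchSize) {
            int end = Math.min(start + batchSize, paths.size());
            batches.add(new ArrayList<>(paths.subList(start, end)));
        }
        return batches;
    }

    /**
     * Merges each batch into an intermediate file, then merges the intermediates
     * into outputPath in a single second pass. The merger callback receives
     * (input paths, output path) and is where the real PDF merge would happen.
     * Returns the intermediate paths so the caller can delete them afterwards.
     */
    public static List<String> mergeInStages(List<String> paths, int batchSize,
                                             String outputPath,
                                             BiConsumer<List<String>, String> merger) {
        List<String> intermediates = new ArrayList<>();
        List<List<String>> batches = partition(paths, batchSize);
        for (int i = 0; i < batches.size(); i++) {
            String part = outputPath + ".part" + i + ".pdf"; // assumed naming scheme
            merger.accept(batches.get(i), part);
            intermediates.add(part);
        }
        merger.accept(intermediates, outputPath);
        return intermediates;
    }
}
```

With a batch size of 100, 1000 input files produce 10 intermediate files and one final merge. The callback is where the DOM calls would go, for example opening each source as a Document and appending its pages to a target document before saving. Treat those calls as an assumption to verify against the Aspose.PDF for Java version in use.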

If you still face a similar exception, we will need your complete code snippet along with a sample PDF file.

We regret that we do not offer any other medium for sharing files. You can upload 600 MB of data to Google Drive, as it offers this much space for free. Please let us know your feedback.


#7

I adjusted the min and max heap allocated for my test runs and still received the same errors. I am including the samples I tested with and the project I used for testing: https://drive.google.com/file/d/1CG1Fl81eyvFOX82Fu88j8GSNAx4_omfI/view?usp=sharing


#8

@dmckinney

We are testing the scenario and will get back to you shortly.