OutOfMemory when processing some pdf

I am using the Aspose.PDF for Java product and get an OutOfMemory exception when starting to process some PDFs (an example is attached).

aspose pdf lib version: 18.7
target platform x64 (amd64)
os: Linux vm-6373529f 4.15.0-33-generic #36~16.04.1-Ubuntu SMP Wed Aug 15 17:21:05 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
java:
openjdk version "1.8.0_181"
OpenJDK Runtime Environment (build 1.8.0_181-8u181-b13-0ubuntu0.16.04.1-b13)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)

@EvilLord666

Thank you for contacting support.

Would you please share a narrowed-down code snippet that reproduces this issue, along with the source PDF files, so that we can try to reproduce and investigate it in our environment? We cannot find any attachment in your post; you may share the requested data via Google Drive, Dropbox, etc.

This unit test reproduces the issue:

@org.junit.Test
public void checkOOMException() {
    com.aspose.pdf.Document document = null;
    try {
        document = new com.aspose.pdf.Document("ЭНЦИКЛОПЕДИЧЕСКИЙСЛОВАРЬМЕДИЦИНСКИХТЕРМИНОВ.pdf");
        com.aspose.pdf.ImagePlacementAbsorber abs = new com.aspose.pdf.ImagePlacementAbsorber();
        document.getPages().accept(abs);
        for (com.aspose.pdf.ImagePlacement imagePlacement : (Iterable<com.aspose.pdf.ImagePlacement>) abs.getImagePlacements()) {
            System.out.println("image width:" + imagePlacement.getRectangle().getWidth());
            System.out.println("image height:" + imagePlacement.getRectangle().getHeight());
            System.out.println("image LLX:" + imagePlacement.getRectangle().getLLX());
            System.out.println("image LLY:" + imagePlacement.getRectangle().getLLY());
            System.out.println("image horizontal resolution:" + imagePlacement.getResolution().getX());
            System.out.println("image vertical resolution:" + imagePlacement.getResolution().getY());
        }
    }
    finally {
        if (document != null)
            document.close();
    }
}

@EvilLord666

Thank you for sharing the code snippet.

This snippet does not reproduce the issue with every document, so the problem is probably file-specific. Would you please share the sample PDF file so that we can try to reproduce and investigate it in our environment?

This is the document that led to the OutOfMemory error.

@EvilLord666

Thank you for sharing requested data.

We have worked with the data shared by you and have been able to reproduce the issue in our environment. A ticket with ID PDFJAVA-38016 has been logged in our issue management system for further investigation and resolution. The ticket ID has been linked with this thread so that you will receive notification as soon as the ticket is resolved.

We are sorry for the inconvenience.

@EvilLord666

Thank you for being patient.

We have found that your code snippet runs without throwing an OutOfMemory exception if the maximum heap size is set above 2 GB. For example, if we run this code with the JVM option -Xmx2400m, we get the result without any exception.
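
To confirm that the heap setting actually reached the JVM running the test, the limit can be printed at startup. The following is only a minimal sketch using the standard Runtime API; the class name HeapCheck and its helper methods are made up for illustration and are not part of Aspose.PDF. Note that the reported value can be slightly lower than the -Xmx figure, depending on the garbage collector.

```java
// Minimal sketch (hypothetical helper, not part of Aspose.PDF):
// report the heap limit the running JVM actually received.
final class HeapCheck {

    /** Returns the maximum heap the JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** Returns true when the heap limit is at least the requested number of megabytes. */
    static boolean hasAtLeast(long requiredMegabytes) {
        return maxHeapMegabytes() >= requiredMegabytes;
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MB): " + maxHeapMegabytes());
        // 2048 MB stands in for the "over 2 GB" figure mentioned above.
        if (!hasAtLeast(2048)) {
            System.out.println("Heap limit is below 2 GB; whole-document processing may run out of memory.");
        }
    }
}
```

Running it as, for example, java -Xmx2400m HeapCheck shows whether the option was picked up; build tools and test runners often start a separate JVM with their own memory settings.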

Moreover, processing the document page by page requires much less memory. The following code snippet executes successfully even with the JVM option -Xmx515m.

com.aspose.pdf.Document document = null;
try {
    // getInputPdf() supplies the input PDF (helper not shown).
    document = new com.aspose.pdf.Document(getInputPdf());
    // Instead of document.getPages().accept(abs), visit one page at a time.
    for (com.aspose.pdf.Page page : document.getPages()) {
        // Use a fresh absorber per page so placements from earlier pages are not retained.
        com.aspose.pdf.ImagePlacementAbsorber abs = new com.aspose.pdf.ImagePlacementAbsorber();
        page.accept(abs);
        for (com.aspose.pdf.ImagePlacement imagePlacement : (Iterable<com.aspose.pdf.ImagePlacement>) abs.getImagePlacements()) {
            System.out.println("image width:" + imagePlacement.getRectangle().getWidth());
            System.out.println("image height:" + imagePlacement.getRectangle().getHeight());
            System.out.println("image LLX:" + imagePlacement.getRectangle().getLLX());
            System.out.println("image LLY:" + imagePlacement.getRectangle().getLLY());
            System.out.println("image horizontal resolution:" + imagePlacement.getResolution().getX());
            System.out.println("image vertical resolution:" + imagePlacement.getResolution().getY());
        }
        // Release the page's resources; the page object must not be used afterwards.
        page.dispose();
    }
}
finally {
    if (document != null)
        document.close();
}

Please note that after invoking page.dispose(), you can no longer work with that page object. We hope this will be helpful. Please feel free to contact us if you need any further assistance.

Thanks @Farhan.Raza, this solution works; however, when I use it, performance slows down.

@EvilLord666

Thank you for your kind feedback.

We are glad to know that the suggested approach works in your environment. Reducing the available memory can slow down program execution, so please allocate sufficient memory to balance speed against resource usage.
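
One way to find a reasonable balance is to run the same processing code under a few different -Xmx values and compare elapsed time and heap usage. The following is only a rough sketch using standard Java APIs; the class RunStats and its measure method are hypothetical names for illustration, not part of Aspose.PDF, and the heap figure is an approximate snapshot taken after the task finishes.

```java
// Rough sketch (hypothetical helper, not part of Aspose.PDF):
// time a task and take an approximate snapshot of heap usage afterwards.
final class RunStats {
    final long elapsedMillis;
    final long usedHeapMegabytes;

    RunStats(long elapsedMillis, long usedHeapMegabytes) {
        this.elapsedMillis = elapsedMillis;
        this.usedHeapMegabytes = usedHeapMegabytes;
    }

    /** Runs the task once and records wall-clock time and heap in use at the end. */
    static RunStats measure(Runnable task) {
        long start = System.nanoTime();
        task.run();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
        Runtime rt = Runtime.getRuntime();
        // totalMemory - freeMemory approximates the heap currently occupied.
        long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024L * 1024L);
        return new RunStats(elapsedMillis, usedMb);
    }

    public static void main(String[] args) {
        // Placeholder workload; replace with the page-by-page processing code.
        RunStats stats = measure(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i);
            }
        });
        System.out.println("Elapsed (ms): " + stats.elapsedMillis);
        System.out.println("Heap in use (MB): " + stats.usedHeapMegabytes);
    }
}
```

Comparing these numbers across several heap sizes gives a rough idea of where extra memory stops improving run time.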