**IMPACT PRODUCTION SERVER** - Aspose.PDF - High Memory Consumption

Hi Support,

We are using Aspose.PDF to convert PDFs to HTML in one of our main applications. However, we are experiencing extremely high memory usage on our server, with consumption reaching 95%.

We have tried all available memory cleanup methods provided by Aspose, but none have been effective. Additionally, we tested with the latest version of Aspose.PDF, but the issue persists.

Could you please provide guidance on resolving this issue as a priority? Your assistance would be greatly appreciated.

Snapshot of memory on local system -
image.png (70.3 KB)

Reference Code:-

try {
    long start = System.currentTimeMillis();

    // Load the source PDF
    document = new Document(pdfFile.getAbsolutePath());

    // Set HTML save options: embed all parts into a single HTML output
    HtmlSaveOptions htmlSaveOptions = new HtmlSaveOptions();
    htmlSaveOptions.setPartsEmbeddingMode(HtmlSaveOptions.PartsEmbeddingModes.EmbedAllIntoHtml);
    htmlSaveOptions.setRasterImagesSavingMode(
            HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground);

    // Capture the output HTML in memory
    byteArrayOutputStream = new ByteArrayOutputStream();
    document.save(byteArrayOutputStream, htmlSaveOptions);

    // Convert the byte array to a string
    htmlContent = byteArrayOutputStream.toString("UTF-8");
    count++;

    // Log the success of the conversion
    System.out.println(count + "Successfully converted: " + pdfFile.getName()
            + "\tSize:" + htmlContent.length()
            + "\tTime " + (System.currentTimeMillis() - start));

    // Every 100 conversions, call the Aspose memory cleanup methods
    if (count % 100 == 0) {
        MemoryCleaner.clearStaticInstances();
        MemoryCleaner.clear();
        MemoryCleaner.clearKeepingFontSettings();
    }
} catch (Exception e) {
    e.printStackTrace(); // Log the exception if necessary
}

Below are the license details:
Aspose License - 20240822
Aspose Version - 24-7

@rchilli

To address the high memory consumption issue you’re experiencing while converting PDFs to HTML using Aspose.PDF, here are several strategies you can implement:

  1. Memory Management: Ensure that you are effectively using the MemoryCleaner class to manage memory. You can call methods like clear(), clearStaticInstances(), and clearKeepingFontSettings() to free up memory after processing a batch of documents. It seems you are already doing this every 100 conversions, which is good practice.

  2. Use Memory Extender: Consider enabling the MemoryExtender class, which allows the use of disk space as temporary swap memory. This can help manage large files without exhausting the heap memory. You can enable it by calling:

    MemoryExtender.setSwapEnabled(true);
    
  3. Skip Heavy Content: If your PDFs contain elements that consume a lot of memory, you can enable the option to skip these heavy content elements during rendering. This can be done by:

    MemoryExtender.setSkipHeavyContentEnabled(true);
    
  4. Optimized Memory Stream: For handling large documents, you can enable the use of OptimizedMemoryStream as the default memory storage. This is particularly useful for documents larger than 2 GB:

    MemoryExtender.setOptimizedMemoryStreamByDefault(true);
    
  5. Element Rendering Timeout: You can set a timeout for rendering single elements, which can help in managing memory usage during the conversion process:

    MemoryExtender.setElementRenderingTimeout(10000); // Set to 10 seconds
    
  6. Batch Processing: If possible, process your documents in smaller batches and clear memory between batches to prevent high memory usage.

  7. Monitor Memory Usage: Keep an eye on the memory usage during the conversion process to identify any specific documents or elements that may be causing spikes in memory consumption.
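
As a rough illustration of points 6 and 7, the sketch below processes items in fixed-size batches, runs a cleanup hook after each batch, and prints the used heap so that spikes can be tied to a batch. It uses only the JDK. `BatchRunner`, `run`, and `usedHeapMiB` are hypothetical names for this example and are not part of Aspose.PDF; the conversion and the cleanup are passed in by the caller.

```java
import java.util.List;
import java.util.function.Consumer;

/** Illustrative helper (not an Aspose class): batch processing with a cleanup hook. */
final class BatchRunner {

    /** Heap currently occupied by objects (live or not yet collected), in MiB. */
    static long usedHeapMiB() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
    }

    /**
     * Runs the task on every item. The cleanup hook is called after each full
     * batch and once more after a trailing partial batch.
     * Returns the number of cleanup calls made.
     */
    static <T> int run(List<T> items, int batchSize,
                       Consumer<? super T> task, Runnable cleanup) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int cleanups = 0;
        int inBatch = 0;
        for (T item : items) {
            try {
                task.accept(item);
            } catch (RuntimeException e) {
                // One failing item should not stop the rest of the batch
                System.err.println("Failed on " + item + ": " + e);
            }
            inBatch++;
            if (inBatch == batchSize) {
                cleanup.run();
                cleanups++;
                inBatch = 0;
                System.out.println("Batch done, used heap: " + usedHeapMiB() + " MiB");
            }
        }
        if (inBatch > 0) {
            // Trailing partial batch
            cleanup.run();
            cleanups++;
        }
        return cleanups;
    }
}
```

In the setting of this thread, the task would wrap the PDF-to-HTML conversion of one file and the cleanup hook would call the `MemoryCleaner` methods mentioned above, for example `BatchRunner.run(pdfFiles, 100, file -> convert(file), MemoryCleaner::clear)`, where `convert` stands for your own conversion method.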

By implementing these strategies, you should be able to mitigate the high memory usage issue on your production server. If the problem persists, consider reaching out to Aspose support for further assistance or to report a potential memory leak in the library.

For more detailed information on memory management, you can refer to the MemoryCleaner and MemoryExtender documentation.

If you have any further questions or need additional assistance, feel free to ask!

We tried all of these but are still facing the issue. Memory is not being released.

Can you please provide the best possible solution?

@rchilli

Would you kindly try version 25.3 of the API? It is the latest release and includes memory usage improvements. If the issue persists with that version, please share one of the sample PDF documents for our reference, along with a screenshot of the memory consumption during execution. We will log an investigation ticket and share the ID with you.
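
If it helps when collecting the memory figures, here is a minimal JDK-only sketch that records heap numbers after requesting a garbage collection. `HeapSnapshot` is a hypothetical name for this example, not an Aspose class. The assumption behind it is that used heap measured after a GC request approximates what is still referenced, while committed heap is what the JVM has reserved from the operating system; native (off-heap) memory is not covered by these numbers.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Illustrative helper (not an Aspose class): a point-in-time view of the Java heap. */
final class HeapSnapshot {
    final long usedBytes;       // heap occupied by objects
    final long committedBytes;  // heap reserved by the JVM from the OS
    final long maxBytes;        // configured maximum, or -1 if undefined

    private HeapSnapshot(MemoryUsage usage) {
        this.usedBytes = usage.getUsed();
        this.committedBytes = usage.getCommitted();
        this.maxBytes = usage.getMax();
    }

    /** Reads the current heap figures without triggering a collection. */
    static HeapSnapshot take() {
        return new HeapSnapshot(ManagementFactory.getMemoryMXBean().getHeapMemoryUsage());
    }

    /** Requests a GC first; System.gc() is only a hint and the JVM may ignore it. */
    static HeapSnapshot takeAfterGc() {
        System.gc();
        return take();
    }

    @Override
    public String toString() {
        long mib = 1024L * 1024L;
        return "used=" + (usedBytes / mib) + " MiB, committed=" + (committedBytes / mib)
                + " MiB, max=" + (maxBytes < 0 ? "undefined" : (maxBytes / mib) + " MiB");
    }
}
```

Printing `HeapSnapshot.takeAfterGc()` every 100 conversions gives a series to attach to the report. A used value that keeps climbing from batch to batch would suggest objects are being retained, whereas a high committed value with a low used value would suggest the JVM is simply holding on to heap it has already freed internally.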

We tried this version as well and are still facing the issue.
Also, the problem is not specific to one document; it occurs with all types of PDF documents.

Major Issue - memory is not being released

@rchilli

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFNET-59657

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.