Memory Usage / Garbage Collection questions

We run a cluster of servers which converts files on demand into G4 bitonal TIFF files. We use aspose.pdf PdfConverter.saveAsTIFF and then further process the tiff using aspose.imaging TiffImage.


What I am seeing is that the first conversion on a newly provisioned server triggers a burst of object allocation that thrashes the JVM heap and causes frequent young generation GC pauses. About every 9 seconds I see GC log statements that look like this:

[GC (Allocation Failure) 2016-03-07T20:03:23.235+0000: 2454.120: [ParNew: 4549818K->776458K(5033216K), 2.2212468 secs] 7511547K->3756040K(11324672K), 2.2213569 secs] [Times: user=2.39 sys=0.01, real=2.22 secs]

So every nine seconds the entire server has to pause for over 2 seconds of garbage collection. This lasts several minutes and sometimes hours before things seem to settle down and run smoother. But this behavior is really unacceptable on our servers because they are unable to service requests and are even occasionally terminated for failure to pass health checks.
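To put numbers on a log line like the one above, here is a minimal sketch that pulls the young-generation figures out of a ParNew entry. The class name ParNewLogLine and its methods are hypothetical helpers written for this illustration, and the regular expression only assumes the log layout shown in this post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts young-generation figures from one ParNew GC log line. */
final class ParNewLogLine {
    // Matches "[ParNew: <before>K-><after>K(<capacity>K), <seconds> secs]"
    private static final Pattern PAR_NEW = Pattern.compile(
            "\\[ParNew: (\\d+)K->(\\d+)K\\((\\d+)K\\), ([0-9.]+) secs\\]");

    final long beforeKb;    // young generation occupancy before the collection
    final long afterKb;     // young generation occupancy after the collection
    final long capacityKb;  // young generation capacity
    final double pauseSecs; // duration of the ParNew collection

    private ParNewLogLine(long beforeKb, long afterKb, long capacityKb, double pauseSecs) {
        this.beforeKb = beforeKb;
        this.afterKb = afterKb;
        this.capacityKb = capacityKb;
        this.pauseSecs = pauseSecs;
    }

    static ParNewLogLine parse(String line) {
        Matcher m = PAR_NEW.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("No ParNew section in: " + line);
        }
        return new ParNewLogLine(
                Long.parseLong(m.group(1)),
                Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3)),
                Double.parseDouble(m.group(4)));
    }

    /** Kilobytes freed from the young generation by this collection. */
    long reclaimedKb() {
        return beforeKb - afterKb;
    }

    /** Share of wall-clock time spent paused if a collection like this recurs every intervalSecs. */
    double pauseFraction(double intervalSecs) {
        return pauseSecs / intervalSecs;
    }
}
```

For the line quoted above, reclaimedKb() gives 3773360 KB (about 3.6 GiB), and pauseFraction(9.0) comes out at roughly 0.25, so about a quarter of wall-clock time is spent in young generation pauses.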

1. What requires so much allocation? The ParNewGC events seem to be trimming ~4GB of memory from the new generation every 9 seconds. The size and content of the PDF we submit for conversion don't seem to matter; it is always the first conversion that triggers this.

2. What recommendations can you provide to reduce/eliminate long GC pauses when using Aspose classes? (e.g. heap size, young generation size, etc).

3. I’ve read that Aspose does everything in memory (no streaming, no HDD usage). With that in mind if a customer submits a 400 page PDF and we try to rasterize that, what would be the minimum amount of memory we’d need to have on a server?
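For a sense of scale on question 3, here is a back-of-envelope sketch of the raw raster size only. It assumes US Letter pages (8.5 x 11 in), uncompressed pixel buffers, and every page held in memory at once; none of this describes how Aspose actually allocates memory, and the class and method names are hypothetical.

```java
/** Hypothetical estimator for uncompressed raster sizes; not a model of Aspose internals. */
final class RasterMemoryEstimate {

    /** Bytes for one uncompressed page, with each pixel row padded to a whole byte. */
    static long pageBytes(double widthInches, double heightInches, int dpi, int bitsPerPixel) {
        long widthPx = Math.round(widthInches * dpi);
        long heightPx = Math.round(heightInches * dpi);
        long rowBytes = (widthPx * bitsPerPixel + 7) / 8;
        return rowBytes * heightPx;
    }

    /** Bytes for a whole document, assuming all pages are resident at the same time. */
    static long documentBytes(int pages, long bytesPerPage) {
        return (long) pages * bytesPerPage;
    }
}
```

Under those assumptions, a Letter page at 300 DPI is 2550 x 3300 pixels: 33,660,000 bytes at 32 bits per pixel, or 1,052,700 bytes at 1 bit per pixel. For 400 pages that is about 13.5 GB at 32 bpp versus about 0.42 GB bitonal, which suggests the intermediate color depth matters far more than the page count alone.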

Thanks,
Adam
Hi Adam,

Thank you for your inquiry.

Please provide us with a few sample files that trigger the young generation GC pauses, along with the code needed to reproduce the situation. We will investigate the issue and update you accordingly. In addition, please share the following details:

  • Operating system version
  • Operating system architecture
  • JDK type (Oracle, OpenJDK, IBM's JDK)
  • JDK version
  • Java Heap size (Min & Max)

For performance optimization, please visit the link Improved Performance with Customizable Cache for details.

Here is the code/algorithm we are using to convert the files:


PdfConverter converter = new PdfConverter();
converter.setResolution(new Resolution(300));
converter.bindPdf(sourceInputStream);
converter.doConvert();

converter.saveAsTIFF(tempFileOutputStream);

//… pass temp file to tiff handling

TiffImage tiffImage = (TiffImage)TiffImage.load(tempFileInputStream);

//adjust brightness
tiffImage.adjustBrightness(desiredBrightness);

//set the resolution and resize the image
tiffImage.setResolution(desiredDpi.getHorizontalResolution(), desiredDpi.getVerticalResolution());
tiffImage.resize((int)desiredWidth, (int)desiredHeight, ResizeType.LanczosResample);

//save as G4 tiff
TiffOptions options = new TiffOptions(TiffExpectedFormat.TiffCcittFax4);
options.setResolutionSettings(desiredDpi);
tiffImage.save(finalOutputStream, options);
I haven’t yet been able to pinpoint which exact step of this flow triggers it, although I suspect it is in the PDF portion of the conversion.

I’ve attached a sample.pdf as one example we’ve been testing.

We are running a Tomcat webapp on an Amazon EC2 c3.2xlarge (Amazon Linux AMI 2015.09) with Oracle Java 1.8.0_73. Heap is 12g total, young generation is 6g. But please keep in mind we experience this issue with other memory settings as well.

Thanks for the link about the caching; we’ll investigate it.

-Adam
Hi Adam,

Thank you for the details and sample file.

We are looking into the issue and will update you on our findings.

Ikram,

I have some new findings. I still don't understand the how and why of it, but we switched our deployment packaging from a .war webapp to a standalone spring-boot jar and the memory problems seem to have disappeared. I don't know if that will make any sense to your team or not.

Thanks,
Adam
Hi Adam,

Thank you for sharing more details with us. We will consider it while investigating the issue.

Hi Adam,

Thanks for the details. This will certainly help. Can you please also share your .war file? We were not able to reproduce the issue without it.

Best Regards,

I am unable to share the .war file - it won’t run outside of our environment. I can share pretty much any configuration if there is anything specific that would help.


As a side question, is that sample.pdf getting converted properly? In our server environment we end up with the situation I describe here: PdfConverter outputting blank page when saving certain PDF even though it converts fine on Mac OS.

We thought at first that maybe the problems were related since the release notes mention fixing PDFNEWJAVA-35444 and our server wasn’t using the newest version.

Thanks,
Adam

Hi Adam,

The issues you mentioned are related to Aspose.Pdf, and the team concerned is investigating them. You will be updated as soon as our investigation is complete.

Sorry for the inconvenience.

Best Regards,

Hi Adam,

Thanks for your patience. Please note that Aspose.PDF processes files in memory, so performance depends on the available system resources and on the size and contents of the input file.

In PDF to TIFF conversion, memory consumption can vary with the compression type. You can use the saveAsTIFF() overload that takes a TiffSettings parameter to set the compression value.

TiffSettings tiffSettings1 = new TiffSettings();
tiffSettings1.setDepth(ColorDepth.Format8bpp);
tiffSettings1.setCompression(CompressionType.CCITT4);

PdfConverter converter = new PdfConverter();
converter.bindPdf("Input.pdf");
converter.doConvert();
converter.saveAsTIFF("PDFtoTIFF.TIFF", 600, 800, tiffSettings1);

Furthermore, you can use the MemoryCleaner class to free memory. After completing operations with an Aspose.Pdf object, close it with its close() or dispose() method, and then call com.aspose.pdf.MemoryCleaner.clear(). This clears Aspose.Pdf-specific instances and should help you use memory more effectively.

Please note that we recommend calling this method only when available memory is running short. The following sample code checks the memory status.

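// All values are converted from bytes to MiB (1048576 bytes)
Runtime rt = Runtime.getRuntime();
long max = rt.maxMemory()/1048576;     // upper limit the JVM will try to use
long total = rt.totalMemory()/1048576; // heap currently reserved by the JVM
long free = rt.freeMemory()/1048576;   // unused part of the reserved heap
long used = total - free;              // heap currently occupied by objects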
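One way to act on these figures is sketched below: run a cleanup action only when used heap crosses a chosen fraction of the maximum. The MemoryGuard class, its method names, and the threshold parameter are hypothetical and written for this illustration; the cleanup is passed in as a Runnable so the sketch does not depend on any Aspose class.

```java
/** Hypothetical helper that runs a cleanup action only when heap usage is high. */
final class MemoryGuard {
    private static final long MIB = 1024L * 1024L;

    /** Heap currently occupied by objects, in MiB. */
    static long usedMiB(Runtime rt) {
        return (rt.totalMemory() - rt.freeMemory()) / MIB;
    }

    /** Upper limit the JVM will try to use, in MiB. */
    static long maxMiB(Runtime rt) {
        return rt.maxMemory() / MIB;
    }

    /** True when used memory is at or above the given fraction (for example 0.8) of the maximum. */
    static boolean isLow(long usedMiB, long maxMiB, double threshold) {
        return usedMiB >= maxMiB * threshold;
    }

    /** Runs the cleanup only if memory is low; returns whether it ran. */
    static boolean cleanIfLow(double threshold, Runnable cleanup) {
        Runtime rt = Runtime.getRuntime();
        if (isLow(usedMiB(rt), maxMiB(rt), threshold)) {
            cleanup.run();
            return true;
        }
        return false;
    }
}
```

With Aspose.Pdf on the classpath, the call described above could be passed as the cleanup, for example MemoryGuard.cleanIfLow(0.8, () -> com.aspose.pdf.MemoryCleaner.clear()). The 0.8 threshold is an arbitrary starting point, not a recommended value.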

Best Regards,