We run into a heap size problem when merging PDFs into a single consolidated PDF file. The heap size limit on the server we are using is 1.5 GB.
We currently use input/output streams to perform the PDF merge. Here is a snippet of the code in Java/Groovy:
PDFMergerUtility mergePdf = new PDFMergerUtility()
filesList?.each {
    def byteArrayOutputStream
    try {
        def s3object = templateAmazonService.downloadFromS3(it?.filePath, it?.fileName, messageData?.bucketName)
        if (s3object) {
            InputStream s3Is = s3object?.objectContent;
            // Helper not shown; it appears to buffer the whole S3 object in memory
            byteArrayOutputStream = createByteArrayInputStream(s3Is);
            s3object?.close();
            // toByteArray() creates a second in-memory copy of the same PDF
            InputStream is0 = new ByteArrayInputStream(byteArrayOutputStream.toByteArray());
            if (is0)
                mergePdf.addSource(is0)
        }
        byteArrayOutputStream?.close()
        byteArrayOutputStream = null
    }
    catch (Exception ex) {
        // Pass the exception as the second argument so the stack trace is logged
        log.error("Error occurred while adding file to mergePDF", ex)
    }
}
mergePdf.mergeDocuments();
The heap problem appears when we merge more than 3000 PDFs. We fetch each PDF from S3 and use its stream as a merge source; consolidating that many files (more than 3000) is what exhausts the heap.
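To make the memory pattern concrete: the loop above holds every source PDF on the heap as a byte array until the merge runs. A minimal sketch of the alternative shape, spooling each stream to a temporary file so that only a small copy buffer stays on the heap, might look like the following. It uses only the JDK; the class and method names (`SpoolToDisk`, `spool`, `spoolAll`) are hypothetical, and the byte arrays in `main` stand in for the S3 object content. If the PDFBox version in use is 2.x, `PDFMergerUtility.addSource(File)` and `mergeDocuments(MemoryUsageSetting.setupTempFileOnly())` look relevant here, but whether they are enough for 3000+ files has not been verified.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

public class SpoolToDisk {

    /** Copies one stream to a temp file and closes it; no full in-memory copy is made. */
    static Path spool(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("merge-src-", ".pdf");
        try (InputStream src = in) {
            // createTempFile already created the file, so it must be replaced
            Files.copy(src, tmp, StandardCopyOption.REPLACE_EXISTING);
        }
        return tmp;
    }

    /** Spools each stream in turn, closing it before moving on to the next. */
    static List<Path> spoolAll(Iterable<? extends InputStream> streams) throws IOException {
        List<Path> files = new ArrayList<>();
        for (InputStream in : streams) {
            files.add(spool(in));
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder bytes; real code would pass the S3 object content stream instead
        byte[] fake = "%PDF-1.4 placeholder".getBytes(StandardCharsets.US_ASCII);
        List<InputStream> inputs = new ArrayList<>();
        inputs.add(new ByteArrayInputStream(fake));
        inputs.add(new ByteArrayInputStream(fake));

        List<Path> files = spoolAll(inputs);
        for (Path p : files) {
            // Each path could be handed to the merger as a File source at this point
            System.out.println(p.getFileName() + " " + Files.size(p) + " bytes");
            Files.delete(p);
        }
    }
}
```

The temp files need to be deleted once the merge has finished, and the server needs enough free disk space for all sources plus the merged output.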
We are also evaluating Aspose's Document class as an alternative, to see whether it manages memory well enough to resolve the heap size issue. Here is a snippet of the code:
Document finalPDF = new Document()
File[] files = file.listFiles()
// Output buffer for the merged PDF
ByteArrayOutputStream bOutputStream = new ByteArrayOutputStream()
for (final File fileEntry : files) {
    Workbook workbook = new Workbook(fileEntry.getAbsolutePath())
    def fileName = fileEntry.getName()
    // Convert the workbook to PDF in memory
    ByteArrayOutputStream dstStream = new ByteArrayOutputStream();
    workbook.save(dstStream, SaveFormat.PDF);
    ByteArrayInputStream srcStream = new ByteArrayInputStream(dstStream.toByteArray());
    Document tempDocument = new Document(srcStream)
    // Append the converted pages to the consolidated document
    finalPDF.getPages().add(tempDocument.getPages())
}
// Save the document that the pages were appended to
finalPDF.save(bOutputStream, com.aspose.pdf.SaveFormat.Pdf);
return bOutputStream.toByteArray()
The Aspose version we are using is 19.5. We use Workbook to save each file as a PDF, add its pages to the consolidated document inside the loop, and save the merged result after the loop.
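Because everything is merged in one pass, every converted document stays reachable until the final save. A minimal sketch of a different shape, merging in fixed-size batches into intermediate files on disk and then merging those, is shown below. It is JDK-only and library-agnostic: `FileMerger` is a hypothetical hook that would wrap whatever Aspose or PDFBox calls do the actual merge, and `CONCAT` is only a byte-concatenating stand-in for demonstration (it does not produce a valid PDF). Whether this actually lowers peak heap usage with Aspose 19.5 is an assumption that would need measuring.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class BatchedMerge {

    /** Hypothetical hook: merges the given files, in order, into one file on disk. */
    interface FileMerger {
        void merge(List<Path> sources, Path target) throws IOException;
    }

    /** Stand-in merger for the demo: plain byte concatenation, NOT a real PDF merge. */
    static final FileMerger CONCAT = (sources, target) -> {
        try (OutputStream os = Files.newOutputStream(target)) {
            for (Path p : sources) {
                Files.copy(p, os);
            }
        }
    };

    /**
     * Merges groups of at most batchSize files into temp files, then repeats on the
     * results until one file remains. No single merge call sees more than batchSize
     * inputs, and the original order of the sources is preserved.
     */
    static Path mergeInBatches(List<Path> sources, int batchSize, FileMerger merger)
            throws IOException {
        if (sources.isEmpty() || batchSize < 2) {
            throw new IllegalArgumentException("need at least one source and batchSize >= 2");
        }
        List<Path> current = new ArrayList<>(sources);
        while (current.size() > 1) {
            List<Path> next = new ArrayList<>();
            for (int i = 0; i < current.size(); i += batchSize) {
                int end = Math.min(i + batchSize, current.size());
                Path out = Files.createTempFile("merge-batch-", ".pdf");
                merger.merge(current.subList(i, end), out);
                next.add(out);
            }
            current = next;
        }
        return current.get(0);
    }

    public static void main(String[] args) throws IOException {
        List<Path> sources = new ArrayList<>();
        for (int i = 0; i < 7; i++) {
            Path p = Files.createTempFile("src-", ".bin");
            Files.write(p, new byte[]{(byte) i});
            sources.add(p);
        }
        Path merged = mergeInBatches(sources, 3, CONCAT);
        System.out.println("merged size: " + Files.size(merged));
    }
}
```

The sketch leaves the intermediate batch files in place; a real implementation would delete them after each round and would pick the batch size based on the typical document size and the 1.5 GB heap limit.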
Is there another approach with Aspose that manages memory efficiently and avoids the heap size issue?