Aspose.Pdf.Kit for Java - PdfFileEditor: concatenate or append

Hi ,

We are using Aspose.Pdf.Kit for Java to either concatenate multiple PDF files (passed as an ArrayList) or append multiple PDFs together to form a single output PDF file.

We have a requirement in which we receive PDF files as input streams, and all of them need to be either appended or concatenated into a final output PDF.

With PdfFileEditor.concatenate we get a heap error when the input PDF array (of streams) exceeds a certain number of files.

As a workaround, we are trying to append each file to a single target file. For example, in a for loop we have code like this:

for (int i = 0; i < numberOfPdfFiles; i++)
{
    pdfFileEditor.append(targetFileName, sourceFile[i], 1, count);
}

Note that the target file is the same for all append operations on the individual PDF files. We get the following error message:

java.lang.ClassCastException: com.aspose.pdf.kit.md
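One way to keep the number of inputs per call bounded, instead of passing every stream at once or appending one file at a time, is to merge in fixed-size batches: each batch produces an intermediate file, and the intermediates are merged at the end. The sketch below only shows the partitioning step in plain Java; the class name BatchPlanner and the batch size are hypothetical, the actual merge call is left to the library, and whether batching avoids the heap error in this case is untested.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Aspose.Pdf.Kit): plans which inputs go into which merge call.
class BatchPlanner {

    // Splits 'items' into consecutive groups of at most 'batchSize' elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<List<T>>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the sub-list so each batch is independent of the source list.
            batches.add(new ArrayList<T>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("a.pdf", "b.pdf", "c.pdf", "d.pdf", "e.pdf");
        // Each printed group would be handed to one merge call that writes an intermediate file.
        for (List<String> batch : partition(files, 2)) {
            System.out.println(batch);
        }
    }
}
```

Each batch would be merged into its own intermediate output rather than into the file it reads from, so no call has the same file as both source and target.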

We request a quick turnaround.

Thanks,

Raji

Hi,

I am using the Aspose PDF license to generate a single PDF from multiple PDFs, and I am getting a memory leak error. This needs to be resolved immediately, as we go live next week.

My license details:

Product: Aspose.Total for Java
Pricing Plan: Developer Small Business
SKU: APJVTODE
Yr(s): 1
Quantity: 1

Thanks,

Raji R

Hi Raji,

First of all, please make sure that you have assigned enough heap space to JVM.

Secondly, please download the latest version of Aspose.Pdf.Kit for Java and try at your end.

If you still find the same problem then please share the input PDF files along with the code snippet which can help us reproduce the same issue at our end. Moreover, please share the details regarding your working environment like OS, JDK version etc. We’ll further investigate the issue and guide you accordingly.
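For reference, the environment details requested above, together with the heap limit the JVM is actually running with, can be printed from a small Java program. This is only a sketch using standard system properties; the class name EnvironmentReport is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: gathers the OS, JDK and heap details a support request usually needs.
class EnvironmentReport {

    static Map<String, String> collect() {
        Map<String, String> info = new LinkedHashMap<String, String>();
        info.put("os.name", System.getProperty("os.name"));
        info.put("os.version", System.getProperty("os.version"));
        info.put("java.version", System.getProperty("java.version"));
        info.put("java.vendor", System.getProperty("java.vendor"));
        // maxMemory() reflects the -Xmx setting, or the JVM default if none was given.
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024L * 1024L);
        info.put("max.heap.mb", String.valueOf(maxHeapMb));
        return info;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> entry : collect().entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```

Running it with the same JVM options as the failing program shows whether the heap setting was really picked up.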

We’re sorry for the inconvenience.
Regards,

Hi,

This is pretty urgent, and we would appreciate a quick response. You can reach me on my cell, +91-9920014033, in case you have any questions or need more clarification.

As per your instructions, we checked with the latest jar and are not seeing any change.

I am pasting the source code below.

package com.dms.test;

import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import org.apache.commons.io.IOUtils;

import aspose.pdf.Bookmark;
import aspose.pdf.Bookmarks;
import aspose.pdf.Pdf;
import aspose.pdf.Section;
import aspose.pdf.Text;

import com.aspose.pdf.kit.PdfContentEditor;
import com.aspose.pdf.kit.PdfFileEditor;
import com.aspose.pdf.kit.PdfFileInfo;
import com.lowagie.text.Document;
import com.lowagie.text.pdf.BaseFont;
import com.lowagie.text.pdf.PdfContentByte;
import com.lowagie.text.pdf.PdfImportedPage;
import com.lowagie.text.pdf.PdfReader;
import com.lowagie.text.pdf.PdfWriter;
import com.lowagie.tools.concat_pdf;

public class TestBookMarks {

public void init() {
System.out.println("Loading Aspose License");
aspose.pdf.License licAsposePDF = new aspose.pdf.License();
com.aspose.words.License licAsposeWords = new com.aspose.words.License();
com.aspose.pdf.kit.License licAsposePDFKit = new com.aspose.pdf.kit.License();
try {
licAsposePDF.setLicense(new FileInputStream(new File(
"/4.Workspace/TIFFGeneration/src/Aspose.Total.Java.lic")));
licAsposeWords.setLicense(new FileInputStream(new File(
"/4.Workspace/TIFFGeneration/src/Aspose.Total.Java.lic")));
licAsposePDFKit.setLicense(new FileInputStream(new File(
"/4.Workspace/TIFFGeneration/src/Aspose.Total.Java.lic")));

} catch (Exception e)
{
System.out.println(e.getMessage());
}
}
public static void main(String args[]) {

System.out.println("Loading Aspose License");
aspose.pdf.License licAsposePDF = new aspose.pdf.License();
com.aspose.words.License licAsposeWords = new com.aspose.words.License();
com.aspose.pdf.kit.License licAsposePDFKit = new com.aspose.pdf.kit.License();
try {
licAsposePDF.setLicense(new FileInputStream(new File(
"/4.Workspace/TIFFGeneration/src/Aspose.Total.Java.lic")));
licAsposeWords.setLicense(new FileInputStream(new File(
"/4.Workspace/TIFFGeneration/src/Aspose.Total.Java.lic")));
licAsposePDFKit.setLicense(new FileInputStream(new File(
"/4.Workspace/TIFFGeneration/src/Aspose.Total.Java.lic")));

} catch (Exception e)
{
System.out.println(e.getMessage());
}

// Bind target pdf document .
try {
String rootPath = "E:\\Projects\\DMS\\PDF_Input\\";

File dir = new File(rootPath);
String[] chld = dir.list();

if (chld == null) {
System.out
.println("Specified directory does not exist or is not a directory.");
System.exit(0);
} else {
InputStream[] inputStreams = new InputStream[chld.length +1 ];
//ArrayList inputPDFArray = new ArrayList();
int countForSummary = 0;
for (int i = 0; i < chld.length; i++) {
// New a object of Class PdfContentEditor.
PdfContentEditor editor = new PdfContentEditor();
String fileName = chld[i];
System.out.println("File Count --- File name : " + i + "---" + rootPath + fileName);
// Match on the file extension rather than on "pdf" appearing anywhere in the name.
if (fileName.toLowerCase().endsWith(".pdf")) {
FileInputStream fileStreamInput = new FileInputStream(rootPath
+ fileName);

PdfFileInfo pdfFileInfo = new PdfFileInfo(fileStreamInput);
int count = pdfFileInfo.getNumberofPages();
countForSummary = countForSummary + count;
fileStreamInput.close();

fileStreamInput = new FileInputStream(rootPath
+ fileName);
editor.bindPdf(fileStreamInput);
editor.deleteBookmarks();
fileStreamInput.close();
for (int k = 0; k < count; k++) {
TestBookMarks tb = new TestBookMarks();
// StringBuffer[] sb = tb.fetchBookMarkDataSingleLine(k);
// editor.createBookmarkOfPage(sb[0].toString() + ";" + sb[1].toString(), k+1);

com.aspose.pdf.kit.Bookmark parentBM = tb.fetchBookMarkDataNested(i, k);
editor.createBookmarks(parentBM);
}
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
//FileOutputStream fs = new FileOutputStream(rootPath + fileName);
editor.save(byteArrayOutputStream);
editor.close();
//editor.save(fs);
inputStreams[i+1] = new ByteArrayInputStream(byteArrayOutputStream.toByteArray());
// inputPDFArray.add(new ByteArrayInputStream(byteArrayOutputStream.toByteArray()));
byteArrayOutputStream.close();
}
}

System.out.println("Out of For loop Safely" +inputStreams.length);
PdfFileEditor pdfFileEditor = new PdfFileEditor();
ByteArrayInputStream summarySheet = generatePDFSummary(countForSummary);
inputStreams[0]= summarySheet;
summarySheet.close();
pdfFileEditor.concatenate(inputStreams, new FileOutputStream("E:\\Projects\\DMS\\PDF_Merged\\Rajesh_1.pdf"));

}

} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}

public com.aspose.pdf.kit.Bookmark fetchBookMarkDataNested(int i, int k )
{
com.aspose.pdf.kit.Bookmark parentBM = new com.aspose.pdf.kit.Bookmark();
parentBM.setPageNumber(k+1);
StringBuffer[] sb =fetchBookMarkDataSingleLine(parentBM.getPageNumber());
parentBM.setTitle("Page Number : " + parentBM.getPageNumber());

com.aspose.pdf.kit.Bookmark childBM1 = new com.aspose.pdf.kit.Bookmark();
childBM1.setPageNumber(k+1);
childBM1.setTitle(sb[0].toString());

com.aspose.pdf.kit.Bookmark childBM2 = new com.aspose.pdf.kit.Bookmark();
childBM2.setPageNumber(k+1);
childBM2.setTitle(sb[1].toString());

Bookmarks allChildBM = new Bookmarks();
allChildBM.add(childBM1);
allChildBM.add(childBM2);

parentBM.setChildItem(allChildBM);

return parentBM;
}
public StringBuffer[] fetchBookMarkDataSingleLine(int i)
{

StringBuffer[] sbArray = new StringBuffer[2];
StringBuffer sb = new StringBuffer();
StringBuffer bl = new StringBuffer();

sb.append("NOP_Versand");
sb.append(";");
sb.append("00000000" + i);
sb.append(";");
sb.append("0001");
sb.append(";");
sb.append("CIF5501400");
sb.append(";");
sb.append("00000000000" + i);
sb.append(";");
sb.append("001");
sb.append(";");
sb.append("001");
sb.append(";");
sb.append("00");
sb.append(";");
sb.append("00");
sb.append(";");
sb.append("0");
sb.append(";");
sb.append("0");
sb.append(";");
sb.append("0");
sb.append(";");
sb.append("000000000000");
sb.append(";");
sb.append("000000000000");
sb.append(";");
sb.append("000000000000");
sb.append(";");
sb.append("GEMB_DMS_0001_ " +i);
sb.append(";");
sb.append("IR21" +i);
sb.append(";");
sb.append("GE_20.117 ");
sb.append(";");
sb.append("0");
sb.append(";");
sb.append("F");

bl.append("NOP_Beliage");
bl.append(";");
bl.append("00000000");

sbArray[0]=sb;
sbArray[1]=bl;


// System.out.println("Bookmark 1 -- " +sb.toString());
// System.out.println("Bookmark 2 -- " +bl.toString());
return sbArray;
}

private static ByteArrayInputStream generatePDFSummary(int count)
{
ByteArrayInputStream bi =null;
try
{

// bi = new ByteArrayInputStream(IOUtils.toByteArray(new FileInputStream(new File("D:\\4.Workspace\\TIFFGeneration_old\\resources\\pdf-merged\\SheetStaistic_0.pdf"))));

//Instantiate a Pdf object by calling its empty constructor
Pdf pdf = new Pdf();
//add a section
Section sec1 = pdf.getSections().add();

// create text object to be added to main document section
Text text = new Text("AFP SUMMARY PAGE ...." + '\n' + "Page Count : " +count);



// add text object to paragraphs collection of section
sec1.getParagraphs().add(text);
// specify the font color information for segment object
text.getTextInfo().setColor(new aspose.pdf.Color("Black"));
// specify the BackGround color information for segment object
//text.getTextInfo().setBackgroundColor(new aspose.pdf.Color("Blue"));
// specify the font name for text object
text.getTextInfo().setFontName("GE Inspira");
// set the font size information
text.getTextInfo().setFontSize(11);

ByteArrayOutputStream summStream = new ByteArrayOutputStream();
//Save the resultant PDF
pdf.save(summStream);

bi = new ByteArrayInputStream(summStream.toByteArray());

}catch (Exception excep)
{
System.out.println(excep.getMessage());
}
return bi;

}


private static void concatenate(ArrayList<InputStream> streamOfPDFFiles, FileOutputStream outputStream, boolean paginate)
{
Document document = new Document();
try {
// Typed collections so that next() returns the element type without a cast.
List<InputStream> pdfs = streamOfPDFFiles;
List<PdfReader> readers = new ArrayList<PdfReader>();
int totalPages = 0;
Iterator<InputStream> iteratorPDFs = pdfs.iterator();

// Create Readers for the pdfs.
while (iteratorPDFs.hasNext()) {
InputStream pdf = iteratorPDFs.next();
iteratorPDFs.remove();
PdfReader pdfReader = new PdfReader(pdf);
readers.add(pdfReader);
totalPages += pdfReader.getNumberOfPages();
pdfReader.close();
pdf.close();
}
// Create a writer for the outputstream
PdfWriter writer = PdfWriter.getInstance(document, outputStream);

document.open();
BaseFont bf = BaseFont.createFont(BaseFont.HELVETICA,
BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
PdfContentByte cb = writer.getDirectContent(); // Holds the PDF

// data

PdfImportedPage page;
int currentPageNumber = 0;
int pageOfCurrentReaderPDF = 0;
Iterator<PdfReader> iteratorPDFReader = readers.iterator();

// Loop through the PDF files and add to the output.
while (iteratorPDFReader.hasNext()) {
PdfReader pdfReader = iteratorPDFReader.next();

// Create a new page in the target for each source page.
while (pageOfCurrentReaderPDF < pdfReader.getNumberOfPages()) {
document.newPage();
pageOfCurrentReaderPDF++;
currentPageNumber++;
page = writer.getImportedPage(pdfReader,
pageOfCurrentReaderPDF);
cb.addTemplate(page, 0, 0);

// Code for pagination.
if (paginate) {
cb.beginText();
cb.setFontAndSize(bf, 9);
cb.showTextAligned(PdfContentByte.ALIGN_CENTER, ""
+ currentPageNumber + " of " + totalPages, 520,
5, 0);
cb.endText();
}
}
pageOfCurrentReaderPDF = 0;
}
outputStream.flush();
document.close();
outputStream.close();
} catch (Exception e) {
e.printStackTrace();
} finally {
if (document.isOpen())
document.close();
try {
if (outputStream != null)
outputStream.close();
} catch (IOException ioe) {
ioe.printStackTrace();
}
}
}
}

The error message we get.

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:95)

at com.aspose.pdf.kit.po.a(Unknown Source)

at com.aspose.pdf.kit.po.<init>(Unknown Source)

at com.aspose.pdf.kit.oi.<init>(Unknown Source)

at com.aspose.pdf.kit.oi.<init>(Unknown Source)

at com.aspose.pdf.kit.PdfFileEditor.a(Unknown Source)

at com.aspose.pdf.kit.PdfFileEditor.concatenate(Unknown Source)

I also tried using the append method, but it is unable to append pages to the same target file.

The JVM heap size was increased to 1024 MB, and we are using Aspose.pdf 4.0.0 (new build).
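A side note on the listing above: every edited PDF is kept in memory as a byte array until concatenate is called, so the application itself holds all the inputs on the heap at once. A minimal sketch of one alternative, assuming the editor's save method accepts any OutputStream as it does in the listing, is to spool each intermediate PDF to a temporary file and open the input streams only just before the merge. The class name TempFileSpool is hypothetical, and this does not change whatever the library buffers internally.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: keeps intermediate PDFs on disk instead of in byte arrays.
class TempFileSpool {

    private final List<File> files = new ArrayList<File>();

    // Creates a temp file for one intermediate PDF; the caller writes to the
    // returned stream and closes it when done.
    OutputStream newEntry() throws IOException {
        File file = File.createTempFile("merge-part-", ".pdf");
        file.deleteOnExit();
        files.add(file);
        return new FileOutputStream(file);
    }

    // Opens the spooled files in insertion order; the caller closes the streams.
    InputStream[] openAll() throws IOException {
        InputStream[] streams = new InputStream[files.size()];
        for (int i = 0; i < streams.length; i++) {
            streams[i] = new FileInputStream(files.get(i));
        }
        return streams;
    }

    int size() {
        return files.size();
    }

    // Removes the temp files once the merged output has been written.
    void deleteAll() {
        for (File file : files) {
            file.delete();
        }
        files.clear();
    }
}
```

In the listing, this would mean passing spool.newEntry() to editor.save(...) in place of the ByteArrayOutputStream, and passing spool.openAll() to concatenate.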

Machine details

-------------------

Windows XP

Jdk 1.5

Thanks,

Raji R

Hi Raji,

Please also share the input PDF files with us as it is very important for our team to reproduce and understand the issue using your particular scenario.

We’re sorry for the inconvenience and appreciate your cooperation.
Regards,

Hi Shahzad,

The sample file was shared as an attachment in the email yesterday.

Thanks,
Raji R

Hi,

I had sent the file via email, but I am uploading it again.

One question: why would the API be dependent on the type of PDF file being used?

Just create copies of this file until they add up to 2500 pages, and you can test run the code.
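The copies can be generated with a few lines of Java rather than by hand. The sketch below is only an illustration: the class name SampleDuplicator, the copy_NNNN.pdf naming and the command-line arguments are hypothetical, and the page count of the sample file has to be supplied by whoever runs it.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: duplicates one sample PDF to build a large test input set.
class SampleDuplicator {

    // Number of copies needed to reach at least 'targetPages' (rounded up).
    static int copiesNeeded(int targetPages, int pagesPerFile) {
        return (targetPages + pagesPerFile - 1) / pagesPerFile;
    }

    // Copies 'source' into 'targetDir' as copy_0001.pdf, copy_0002.pdf, ...
    static List<Path> duplicate(Path source, Path targetDir, int copies) throws IOException {
        Files.createDirectories(targetDir);
        List<Path> created = new ArrayList<Path>();
        for (int i = 1; i <= copies; i++) {
            Path target = targetDir.resolve(String.format("copy_%04d.pdf", i));
            Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
            created.add(target);
        }
        return created;
    }

    public static void main(String[] args) throws IOException {
        // Arguments: <sample.pdf> <output directory> <pages in the sample file>
        int copies = copiesNeeded(2500, Integer.parseInt(args[2]));
        List<Path> created = duplicate(Paths.get(args[0]), Paths.get(args[1]), copies);
        System.out.println("Created " + created.size() + " copies");
    }
}
```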

Thanks,

Raji R

Hi Raji,

Thank you very much for sharing the sample PDF file with us. I have reproduced the issue at my end and logged it as PDFKITJAVA-31753 in our issue tracking system. Our team will investigate this issue and you’ll be updated via this forum thread once it is resolved.

Also, a particular issue can sometimes occur because of the specific content or structure of a PDF file. We try to cater for all possible scenarios, but a unique situation may still produce such an issue. Our team will look into it and provide you with a resolution.

We’re sorry for the inconvenience.
Regards,

Hi,

We have to deliver this solution by Tuesday. Can you please let me know by when you can send me the updates on this issue?

Thanks,
Raji R

Hi Raji,

Sorry for the late reply. As we have only just been able to notice this problem, we need to investigate it in detail, and I am afraid it is difficult to share an exact timeline for the fix. However, I have requested the development team to share an ETA for its resolution as soon as they have one. Please be patient and allow us a little time. We apologize for the delay and inconvenience.