Blank Page added between pages - Source:PDF bytes, Destination - PDF file

I’m trying to convert a PDF byte array (the source is a PDF file converted to byte[]) to a PDF file after appending a few documents. The source file has 4 pages. I’m using the code below to achieve this. It gives me an additional blank page (5 pages in total) when I convert the byte[] to a PDF file. I can share the source file if required.

com.aspose.pdf.Document pdfDocument = new com.aspose.pdf.Document(new ByteArrayInputStream(<pdf byte[]>));
ByteArrayOutputStream baOs = new ByteArrayOutputStream();
pdfDocument.save(baOs, com.aspose.pdf.SaveFormat.DocX);
com.aspose.words.Document wordDoc = new com.aspose.words.Document(
new ByteArrayInputStream(baOs.toByteArray()));
com.aspose.words.Document wordDocumentHelper = documentBuilder.getDocument();
wordDocumentHelper.appendDocument(wordDoc, ImportFormatMode.KEEP_SOURCE_FORMATTING);
documentBuilder = new DocumentBuilder(wordDocumentHelper);

Final PDF: pages 1 and 2 have the expected content. Page 4 is blank. Pages 3-6 should contain the source PDF content, but they do not.

sample (96).pdf (773.0 KB)
Source PDF file.

We are using a licensed version of the Aspose API:
18.10 - Aspose.PDF
18.4 - Aspose.Words

Please advise on how to fix the issue.

@hai2xavier

Thanks for your inquiry. We have tested the scenario using the latest versions, Aspose.PDF for Java 18.12 and Aspose.Words for Java 19.1, with the following code example. We could not reproduce the blank page issue. Please use the latest versions of Aspose.PDF and Aspose.Words.

com.aspose.pdf.Document pdf = new com.aspose.pdf.Document(MyDir + "sample (96).pdf");
pdf.save(MyDir + "out.docx", com.aspose.pdf.SaveFormat.DocX);

com.aspose.words.Document doc = new com.aspose.words.Document(MyDir + "out.docx");
doc.save(MyDir + "19.1.pdf");

Can you please confirm whether the issue is with the versions that we use (Aspose.PDF 18.10 and Aspose.Words 18.4)? Our systems run on these versions, and an upgrade would be quite difficult because of its wide impact.

@hai2xavier

Thanks for your inquiry. We have tested the scenario using Aspose.Words for Java 18.4 and Aspose.PDF for Java 18.10. We could not reproduce the reported issue.

The problematic PDF you shared has different content from ‘sample (96).pdf’. To ensure a timely and accurate response, please attach the following resources here for testing:

  • Your input PDF that you are using.
  • A simple Java application (source code without compilation errors) that helps us reproduce your problem on our end.

As soon as you get these pieces of information ready, we will start investigation into your issue and provide you more information. Thanks for your cooperation.

PS: To attach these resources, please zip and upload them.

Thanks Tahir. The use case here is as follows.

  1. Receive a PDF byte[] (the byte array representation of a PDF document) from different sources, append its content to a Word document using the Aspose DocumentBuilder/Document objects, convert the final content to a PDF byte[], and pass it back as the response.

The sample code you shared reads the file directly from a directory path, which is not our case. The PDF byte[] that I’m talking about arrives as a request to a back-end service. Nevertheless, to reproduce the issue you can run the attached sample application.
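For local testing, a request payload can be simulated by reading a file fully into a byte[] and wrapping it in a ByteArrayInputStream, which yields the same bytes as reading the file by path. The sketch below uses only the JDK; the class and method names (ByteRequestSketch, simulateRequestPayload, asStream, drain) are hypothetical and are not part of any Aspose API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper class, JDK only; not part of Aspose.
final class ByteRequestSketch {

    // Reads a local file fully, standing in for the byte[] a back-end service would receive.
    static byte[] simulateRequestPayload(Path file) throws IOException {
        return Files.readAllBytes(file);
    }

    // Wraps the payload in a stream, the same way the original post feeds it to a Document constructor.
    static InputStream asStream(byte[] payload) {
        return new ByteArrayInputStream(payload);
    }

    // Drains a stream back into a byte[] so the round trip can be compared.
    static byte[] drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        in.transferTo(out); // InputStream.transferTo requires Java 9 or later
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // A temporary file with placeholder content stands in for the real PDF.
        Path tmp = Files.createTempFile("payload", ".bin");
        byte[] original = {'%', 'P', 'D', 'F', '-'};
        Files.write(tmp, original);

        byte[] payload = simulateRequestPayload(tmp);
        byte[] roundTrip = drain(asStream(payload));

        // The stream over the byte[] carries exactly the file's bytes.
        System.out.println(Arrays.equals(original, roundTrip));
        Files.delete(tmp);
    }
}
```

If the bytes match, any difference in output between the path-based and byte-based runs has to come from later steps rather than from how the PDF was loaded.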

I have attached the input PDF, a sample Java application, and the generated PDF and Word documents for your reference, so that you can replicate the issue we face. sample.zip (2.7 MB)

@hai2xavier

Thanks for sharing the details. You are facing this issue because of the section breaks (new page) in the LocalGeneratedWord_Intermediate.docx document. Please remove the section breaks from the document to get the desired behavior.

Please check the RemoveSectionBreaks method in the following code example. We have attached the output PDF to this post for your reference. OutputPdf.zip (1013.5 KB)

// Target Word document that the converted PDF content is inserted into.
com.aspose.words.Document document = new com.aspose.words.Document();
DocumentBuilder documentBuilder = new DocumentBuilder(document);

// Load the source PDF as a byte array, as it would arrive in a request.
byte[] caseDocBytes = Files.readAllBytes(Paths.get(MyDir + "Sample.pdf"));
com.aspose.pdf.Document pdfDocument = new com.aspose.pdf.Document(new ByteArrayInputStream(caseDocBytes));

// Convert the PDF to DOCX in memory.
ByteArrayOutputStream pdfBaos = new ByteArrayOutputStream();
pdfDocument.save(pdfBaos, com.aspose.pdf.SaveFormat.DocX);
// Also write the intermediate DOCX to disk so that it can be inspected.
pdfDocument.save(MyDir + "LocalGeneratedWord_Intermediate.docx", com.aspose.pdf.SaveFormat.DocX);

com.aspose.words.Document wordDoc = new com.aspose.words.Document(
        new ByteArrayInputStream(pdfBaos.toByteArray()));

// Merge all sections into one before inserting, then save the result as PDF.
RemoveSectionBreaks(wordDoc);
documentBuilder.insertDocument(wordDoc, ImportFormatMode.KEEP_SOURCE_FORMATTING);
document.save(MyDir + "output.pdf", com.aspose.words.SaveFormat.PDF);

private static void RemoveSectionBreaks(com.aspose.words.Document doc)
{
    // Loop through all sections starting from the section that precedes the last one
    // and moving to the first section.
    for (int i = doc.getSections().getCount() - 2; i >= 0; i--)
    {
        // Copy the content of the current section to the beginning of the last section.
        doc.getLastSection().prependContent(doc.getSections().get(i));
        // Remove the copied section.
        doc.getSections().get(i).remove();
    }
}
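
To see why the backward loop in RemoveSectionBreaks preserves reading order, here is a toy model that uses only the JDK. It assumes a document can be represented as a list of sections, each a list of paragraph strings; SectionMergeSketch and mergeIntoLastSection are hypothetical names and do not use or reproduce the Aspose.Words API.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: a "document" is a list of sections, and a section is a list of paragraphs.
// None of these types are Aspose classes.
final class SectionMergeSketch {

    // Same loop shape as RemoveSectionBreaks: start at the section before the last one,
    // prepend its content to the last section, remove it, and move toward the first section.
    static void mergeIntoLastSection(List<List<String>> sections) {
        for (int i = sections.size() - 2; i >= 0; i--) {
            List<String> last = sections.get(sections.size() - 1);
            last.addAll(0, sections.get(i)); // prepending in reverse order keeps the original order
            sections.remove(i);              // remove(int index): drops the section just copied
        }
    }

    public static void main(String[] args) {
        List<List<String>> sections = new ArrayList<>();
        sections.add(new ArrayList<>(List.of("p1", "p2")));
        sections.add(new ArrayList<>(List.of("p3")));
        sections.add(new ArrayList<>(List.of("p4", "p5")));

        mergeIntoLastSection(sections);

        System.out.println(sections.size()); // prints 1
        System.out.println(sections.get(0)); // prints [p1, p2, p3, p4, p5]
    }
}
```

In this model, three sections collapse into one with the paragraphs in their original order, so no section boundary remains that could start a new page.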