Content is pushed to a new page after DOCX to PDF conversion using .NET

Hi,

We are experiencing issues with Aspose PDF adding unexpected page breaks that are not present in the original Word file. When we use Word’s built-in PDF converter to convert from DOCX to PDF, everything works fine.
The issue occurs intermittently in DOCX files with more than 20 pages.

Thank you for your assistance!

Regards / Mattias

@mathog

Would you kindly share your sample source and output files for our reference? Also, please share the complete code snippet that you are using to convert DOCX to PDF, since that conversion is performed by Aspose.Words, not Aspose.PDF.

Sorry for the delay. I had to investigate this matter further. I am attaching a PPT describing the issues. It’s in Swedish, but I think the pictures show the issue clearly.

Here is the code we are using:

        // Apply the Aspose.Words license.
        var wordLic = new Aspose.Words.License();
        wordLic.SetLicense("Aspose.Total.lic");

        // Render the DOCX to PDF/A-1b in memory with Aspose.Words.
        MemoryStream pdfStream = new MemoryStream();
        WordSaving.PdfSaveOptions saveOptions = new WordSaving.PdfSaveOptions();
        saveOptions.Compliance = WordSaving.PdfCompliance.PdfA1b;
        saveOptions.UseHighQualityRendering = true;

        using (Stream read = listItemVersion.ListItem.File.OpenBinaryStream())
        {
            Words.Document dokument = new Words.Document(read);

            dokument.Save(pdfStream, saveOptions);
        }

        // Post-process the generated PDF with Aspose.PDF (see below).
        return ConvertToPDFA(pdfStream);


    // Converts the PDF produced by Aspose.Words to PDF/A-1a using Aspose.PDF.
    private static MemoryStream ConvertToPDFA(MemoryStream inputStream)
    {
        MemoryStream stream = new MemoryStream();

        var pdfLic = new Aspose.Pdf.License();
        pdfLic.SetLicense("Aspose.Total.lic");
        pdfLic.Embedded = true;

        Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(inputStream);
        // The conversion log goes to a throwaway stream; the result is not checked.
        bool result = pdfDocument.Convert(new MemoryStream(), PdfFormat.PDF_A_1A, ConvertErrorAction.Delete);
        pdfDocument.Save(stream);
        stream.Flush();
        // Rewind so the caller can read the PDF from the start.
        stream.Position = 0L;
        return stream;
    }

TDOK revidering 2.zip (727.8 KB)

@mathog

Thanks for sharing the sample code and screenshots.

As requested earlier, would you kindly share the sample source Word file with us as well? This would help us test the scenario and address it accordingly.

Attached is the original DOCX.

TDOK 2019-0478.zip (9.7 MB)

@mathog

We have tested the scenario in our environment, and the issue appears to be related to Aspose.Words: the PDF generated by Aspose.Words already contains the problems shown in your screenshots. We are moving this thread to the respective forum category, where you will be assisted accordingly.

@mathog

We have tested the scenario and noticed that the contents of pages 43 and 44 are rendered incorrectly. We have logged this problem in our issue tracking system as WORDSNET-20584. You will be notified via this forum thread once this issue is resolved.

We apologize for the inconvenience.

Thank you for your investigation. It seems that this issue occurs when a picture or object is larger than the area inside the document’s defined margins. So if a picture at the end of a page does not fit within the page margins, the converted PDF gets a page break at that point.

This issue does not occur when using the Microsoft Word Conversion Service to create a PDF…
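
In case it helps with reproducing this, here is a minimal sketch for listing the shapes that might trigger the behavior. It assumes the cause really is shapes larger than the area inside the margins, which is only our observation so far. The class name `OversizedShapeReport` and the default path `input.docx` are hypothetical. All sizes are in points.

```csharp
using System;
using Aspose.Words;
using Aspose.Words.Drawing;

// Hypothetical helper: lists shapes that are larger than the area inside
// the page margins of the section that contains them.
class OversizedShapeReport
{
    static void Main(string[] args)
    {
        // Hypothetical input path; pass the real document as the first argument.
        Document doc = new Document(args.Length > 0 ? args[0] : "input.docx");

        // Visit every shape (pictures, text boxes, etc.) in the document.
        foreach (Shape shape in doc.GetChildNodes(NodeType.Shape, true))
        {
            Section section = (Section)shape.GetAncestor(NodeType.Section);
            if (section == null)
                continue;

            // Usable area = page size minus the margins, in points.
            PageSetup ps = section.PageSetup;
            double usableWidth = ps.PageWidth - ps.LeftMargin - ps.RightMargin;
            double usableHeight = ps.PageHeight - ps.TopMargin - ps.BottomMargin;

            if (shape.Width > usableWidth || shape.Height > usableHeight)
            {
                Console.WriteLine(
                    "Shape '{0}': {1:F1} x {2:F1} pt exceeds usable area {3:F1} x {4:F1} pt",
                    shape.Name, shape.Width, shape.Height, usableWidth, usableHeight);
            }
        }
    }
}
```

The check only compares the shape size against the usable page area; it does not account for where the shape sits on the page, so it may flag shapes that render fine.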

@mathog

Thanks for sharing the details. We have added them to the logged issue in our issue tracking system. And yes, this issue does not appear when the document is converted to PDF using MS Word.