When converting legal documents from Word to PDF, we’re getting line and/or paragraph spacing issues, and we cannot determine the cause or find a workaround in the Word document. As a result, the DOCX and the PDF have a different number of pages. Pages break at different paragraphs, so headers and page breaks do not land where the user would expect. Exact or very similar PDF output is essential for legal documents; even the page count matters a great deal in legal applications.
After users upload a DOCX file, we save it as PDF with code similar to the following snippet, with no further document manipulation:
import com.aspose.words.*;
import java.io.File;
import java.io.FileInputStream;

File inputFile = new File(filePath);
File resultFile = new File(filePath.replace(".docx", ".pdf"));
System.out.println("Creating new PDF document: " + resultFile.getAbsolutePath());
try (FileInputStream fis = new FileInputStream(inputFile))
{
    Document wordDocument = new Document(fis);
    PdfSaveOptions saveOptions = new PdfSaveOptions();
    // Compression and image downsampling settings
    saveOptions.setTextCompression(PdfTextCompression.FLATE);
    saveOptions.setImageCompression(PdfImageCompression.JPEG);
    saveOptions.getDownsampleOptions().setDownsampleImages(true);
    saveOptions.getDownsampleOptions().setResolution(144);
    saveOptions.setJpegQuality(90);
    // PDF/A-1b output
    saveOptions.setCompliance(PdfCompliance.PDF_A_1_B);
    saveOptions.setMemoryOptimization(true);
    saveOptions.setTempFolder(new File(System.getProperty("java.io.tmpdir")).getAbsolutePath());
    // Pass saveOptions to save(); without this argument the settings above are never applied
    wordDocument.save(resultFile.getAbsolutePath(), saveOptions);
}
I have created a fully standalone sample project and have attached the source code, the source document, the PDF converted by Aspose.Words, and the PDF that Word produces when saving as PDF. I would expect all of these to have the same paragraphs on the same pages and the same total number of pages.
Also, the example document I’m uploading has all fonts (Century Schoolbook) embedded, so I don’t believe font substitution should be an issue. I have also tried removing the PDF/A compliance setting, and the spacing problem persists.
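To make the page-count difference easy to check without opening each file, something like the following rough sketch could be used to compare two PDFs. It is not part of our application and does not use Aspose.Words; the class name `PdfPageCountCheck` and the method `countPages` are made up for illustration. It simply counts `/Type /Page` entries in the raw PDF bytes, which is only a heuristic: it assumes page objects are stored uncompressed (expected for PDF/A-1b output, but not guaranteed for every PDF producer), so treat a surprising result with suspicion.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: heuristic page counter for comparing two PDF files.
class PdfPageCountCheck {
    // Matches "/Type /Page" or "/Type/Page", but not "/Type /Pages" (the page-tree node).
    private static final Pattern PAGE_OBJECT =
            Pattern.compile("/Type\\s*/Page(?![A-Za-z])");

    // Counts page dictionaries found in the raw bytes of a PDF.
    // Assumption: page objects are not hidden inside compressed object streams.
    static int countPages(byte[] pdfBytes) {
        // ISO-8859-1 maps each byte to exactly one char, so binary data decodes safely.
        String raw = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        Matcher matcher = PAGE_OBJECT.matcher(raw);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: java PdfPageCountCheck <first.pdf> <second.pdf>");
            return;
        }
        int first = countPages(Files.readAllBytes(Paths.get(args[0])));
        int second = countPages(Files.readAllBytes(Paths.get(args[1])));
        System.out.println(args[0] + ": " + first + " page(s)");
        System.out.println(args[1] + ": " + second + " page(s)");
        System.out.println(first == second ? "Page counts match" : "Page counts differ");
    }
}
```

It would be run with the Aspose-generated PDF and the Word-generated PDF as the two arguments. A matching count does not prove that paragraphs fall on the same pages, only that the totals agree.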
wordconvert.zip (2.7 KB)
20-0681 Nelson.docx (247.8 KB)
20-0681 Nelson (Aspose).pdf (115.2 KB)
20-0681 Nelson (Word).pdf (158.2 KB)