Convert DOCX to PDF using Java & Support Auto Fit Tables with Cells Spanning Multiple Text Columns | Linux

Hello,

We are seeing an issue where the first column of a table is being suppressed when a Document is converted to PDF.
The following describes the basic flow related to this issue:

  • A new Document is created.
  • HTML containing a table is added to the new document.
    Note: in order to offset the table from the left margin on the page, the first column of the table is empty by design.
  • The Document is saved as DOCX.
  • The Document is reloaded and saved as PDF.
  • When the PDF is rendered in a PDF viewer, the first column of the table is suppressed and the table is not offset from the left margin as designed.
  • If you open the Document in MS Word the first column is intact and the table shows as expected (i.e. similar to how a Browser renders the HTML).
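
For readers without the attachment, here is a rough sketch of the kind of HTML described in the steps above: a table whose first column is left empty so the visible cells are pushed away from the left margin. The markup, the 60px width, and the class name EmptyColTableSample are assumptions made for illustration. This is not the content of the attached EmptyColTable.html, and it has not been verified to trigger the issue.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class EmptyColTableSample {

    // Hypothetical markup: the first cell of each row is empty and borderless,
    // acting as a spacer that offsets the bordered cells from the left margin.
    static String buildHtml() {
        return String.join("\n",
            "<html><body>",
            "<table style=\"border-collapse: collapse;\">",
            "  <tr>",
            "    <td style=\"width: 60px;\"></td>",
            "    <td style=\"border: 1px solid black;\">Name</td>",
            "    <td style=\"border: 1px solid black;\">Value</td>",
            "  </tr>",
            "  <tr>",
            "    <td style=\"width: 60px;\"></td>",
            "    <td style=\"border: 1px solid black;\">Alpha</td>",
            "    <td style=\"border: 1px solid black;\">1</td>",
            "  </tr>",
            "</table>",
            "</body></html>");
    }

    public static void main(String[] args) throws IOException {
        // Write the sample to a temp file so it can be fed to the import code.
        Path out = Files.createTempFile("EmptyColTable", ".html");
        Files.write(out, buildHtml().getBytes(StandardCharsets.UTF_8));
        System.out.println("Wrote sample HTML: " + out);
    }
}
```

The path it prints could be used in place of the attachment path when trying the import and PDF conversion.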

This behavior can be seen in the latest Aspose Words for Java version 20.8, the attached EmptyColTable.html file and the following Java code:

final String htmlSrc = [PATH] + "EmptyColTable.html";

// create a new Document
Document wdDoc = new Document();

// load & insert the HTML into the doc
String html = new String(Files.readAllBytes(Paths.get(htmlSrc)), "UTF-8");
DocumentBuilder builder = new DocumentBuilder(wdDoc);
builder.moveToDocumentEnd();
builder.insertHtml(html, false);

// Save the new Document
final String newDoc = htmlSrc.replace(".html", ".docx");
Files.deleteIfExists(Paths.get(newDoc));
wdDoc.save(newDoc);
System.out.println("Saved *Word* Document:  " + newDoc);

// Reload the Aspose saved Document and convert it to PDF
wdDoc = new Document(newDoc);
final String newPdf = newDoc.replace(".docx", ".pdf");
Files.deleteIfExists(Paths.get(newPdf));
wdDoc.save(newPdf);
System.out.println("Saved *PDF* Document:  " + newPdf);
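
A side note on the path handling in the snippet above, unrelated to the rendering problem: String.replace swaps every occurrence of ".html" in the path, including one inside a directory name. A minimal sketch of a helper that changes only the final extension follows; OutputPaths and withExtension are hypothetical names, not part of Aspose.Words.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class OutputPaths {

    // Hypothetical helper: replaces only the extension of the file name,
    // leaving the parent directories untouched.
    static Path withExtension(Path source, String newExt) {
        String name = source.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = (dot > 0) ? name.substring(0, dot) : name;
        return source.resolveSibling(base + newExt);
    }

    public static void main(String[] args) {
        Path html = Paths.get("work", "EmptyColTable.html");
        System.out.println(withExtension(html, ".docx"));
        System.out.println(withExtension(html, ".pdf"));
    }
}
```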

Running the above code should produce a new:

  1. Word Document similar to the attached EmptyColTable.docx file. If you open the new document in Word, you should see that the first column of the table is intact and the document appears similar to how a Browser renders the HTML file.
  2. PDF file similar to the attached EmptyColTable.pdf. If you open this file in a PDF viewer, the first column of the table is suppressed and the table is not offset from the left margin as designed.

Environment Details:

  • Aspose Words for Java 20.8
  • Java version 1.8.0_211
  • Windows 10 OS (but also reproducible under Linux).

Workaround:
We’ve found the following workaround:

  1. Open the Word Document in MS Word and save it without making any edits.
  2. Generate the PDF via Aspose Words using the Word-saved DOCX.
    The resulting PDF now appears as expected.

However, this is not a practical workaround, since we cannot rely on MS Word to correct the issue.

The EmptyColTable.zip (63.4 KB) attachment contains:

  • EmptyColTable.html: HTML file that is imported into the new Document per the code above.
  • EmptyColTable.docx: Word document produced by the code above in our environment; it displays as expected.
  • EmptyColTable.pdf: PDF file produced by the code above in our environment; it shows the suppressed first column.

Thank you!

@oraspose,

We tested the scenario and were able to reproduce the problem on our end. We have logged it in our issue tracking system as WORDSNET-21082. We will investigate further and keep you updated on the status of the fix. We apologize for the inconvenience.

The issues you have found earlier (filed as WORDSNET-21082) have been fixed in this Aspose.Words for Java 22.2 update.