Insert HTML String with Table in Word DOCX Document Java | Avoid Inheriting First Cell's Left Padding Value as Table Indentation

Hello,

When a table is added to a Document from HTML, its left indent is taken from the left padding of the top-left cell of the HTML table. Specifically, when the first <td> of the HTML table has a padding-left style greater than zero, the resulting table in the Word document is indented by that padding amount. We expect the padding-left style of the first (top-left) cell not to affect the position of the entire table.

This behavior can be reproduced with the latest Aspose.Words for Java version (20.8), the attached FirstCellIndent-base.docx and FirstCellIndent.html files, and the following Java code:

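import com.aspose.words.Document;
import com.aspose.words.DocumentBuilder;

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// [PATH] is a placeholder for the folder that holds the attached files
final String docxSrc = [PATH] + "FirstCellIndent-base.docx";
final String htmlSrc = [PATH] + "FirstCellIndent.html";

// load the base document
Document wdDoc = new Document(docxSrc);

// load the HTML file as UTF-8 text
String html = new String(Files.readAllBytes(Paths.get(htmlSrc)), StandardCharsets.UTF_8);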

// add the HTML to the end of the doc
DocumentBuilder builder = new DocumentBuilder(wdDoc);
builder.moveToDocumentEnd();
builder.insertHtml(html, false);

final String newDoc = htmlSrc.replace(".html", ".docx");
Files.deleteIfExists(Paths.get(newDoc));
wdDoc.save(newDoc);

System.out.println("Created new Document:  " + newDoc);

Running the above code should produce a new document similar to the attached FirstCellIndent.docx file. When opened in Word, it contains two tables: the top table is indented and the second is not. Per the source HTML, the only difference between the two tables is that the first cell of the top table specifies the padding-left:25.0px style, yet this cell-level style indents the entire table.
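For readers without the attachment, the two-table setup described above can be approximated with a small HTML generator. This is only a sketch: the class name ReproHtml, the helper names, the cell contents, and the border attribute are hypothetical, and the attached FirstCellIndent.html may differ in its details. The only part taken from the report is the padding-left:25.0px style on the first cell of the top table.

```java
class ReproHtml {
    // Builds a one-row, two-cell table. The given inline style (if any)
    // is applied to the top-left <td> only. Cell text is illustrative.
    static String table(String firstCellStyle) {
        String styleAttr = firstCellStyle.isEmpty()
                ? ""
                : " style=\"" + firstCellStyle + "\"";
        return "<table border=\"1\">"
             + "<tr><td" + styleAttr + ">A1</td><td>B1</td></tr>"
             + "</table>";
    }

    // Two tables that differ only in the left padding of the first cell.
    static String buildHtml() {
        return "<html><body>"
             + table("padding-left:25.0px") // top table: padded first cell
             + "<p>&nbsp;</p>"
             + table("")                    // second table: no padding
             + "</body></html>";
    }

    public static void main(String[] args) {
        System.out.println(buildHtml());
    }
}
```

The string returned by buildHtml() could be passed to builder.insertHtml(html, false) in place of the file contents read in the code above.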

Note: This may be related to this thread, but that example uses cellpadding at the table level, whereas this one uses padding-left at the cell level.

Environment Details:

  • Aspose.Words for Java 20.8
  • Java version 1.8.0_211
  • Windows 10 OS (but also reproducible under Linux).

The FirstCellIndent.zip (20.5 KB) attachment contains:

  • FirstCellIndent-base.docx: Base document that receives the HTML; it is loaded by the code above.
  • FirstCellIndent.html: HTML file that is imported into the target Document.
  • FirstCellIndent.docx: Word document produced by the code above in our environment, demonstrating the indented table.

Thank you!

@oraspose,

Using the latest version of Aspose.Words (20.8), we were able to reproduce this issue on our end. We have logged it in our bug tracking system with ID WORDSNET-21052. Your thread has been linked to this issue, and you will be notified here as soon as it is resolved. Sorry for the inconvenience.