Converting HTML with two tables stacked to DOCX results in a hidden paragraph being inserted between the tables

Hi!

Software (incl. versions) where issue can be reproduced:
We’re using the NuGet package “Aspose.Words” version 23.10.0 to convert HTML to DOCX with DocumentBuilder.InsertHtml. The problem can also be reproduced with the free online HTML to DOCX converter.

The problem is that when an HTML fragment containing two consecutive tables is converted to DOCX, a paragraph with a hidden font is inserted between the two tables in the DOCX. (It can be seen by showing formatting symbols in the document with Show/Hide ¶.) I have provided two HTML fragments:

Fragment 1:

<table>
    <td>
        <p> Some data </p>
    </td>
</table>
<table>
    <td>
        <p> Some other data </p>
    </td>
</table>

Fragment 2:

<table>
    <td>
        <p> Some data </p>
    </td>
</table>
<p></p>
<table>
    <td>
        <p> Some other data </p>
    </td>
</table>

When these HTML fragments are converted to DOCX using Aspose (Test1.docx and Test2.docx, 7.8 KB each), a paragraph with a hidden font is added between the two tables, which was not there in the HTML fragments. It can be seen in Word by making formatting marks visible with the Show/Hide ¶ button.

From what I have read, it seems that DOCX doesn’t really allow two tables to follow each other directly, and that is why the hidden paragraph is inserted between them during conversion. Is this right?

Either way, the second HTML fragment has an empty paragraph <p></p> between the tables, which should also appear in the converted DOCX as a visible empty paragraph. Instead, the DOCX again contains just a paragraph with a hidden font. This does not match the original HTML fragment, where the paragraph creates an empty line between the two tables.

Thanks in advance!

@troubledog You are right, the MS Word document format does not allow tables to follow each other directly. There must be a paragraph after a table; otherwise adjacent tables are treated as a single table. To prevent this, Aspose.Words adds an empty hidden paragraph between the tables to keep them separate, since concatenating tables might lead to table layout issues. In this case Aspose.Words also mimics MS Word behavior: if you open your test HTML fragment in MS Word, you will see the same hidden empty paragraph between the tables.
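
To check a converted file for this separator outside of Word, one option is to look at the document XML directly. The following is only a minimal sketch in Python (standard library only), not Aspose.Words code. It assumes the hidden flag is stored as a `w:vanish` element on the paragraph mark (`w:pPr/w:rPr`), and the `SAMPLE` markup is hand-written for illustration rather than taken from Test1.docx. The helper names `is_on` and `separators_between_tables` are hypothetical.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def is_on(el):
    """A toggle such as <w:vanish/> is on unless w:val is an explicit off value."""
    return el is not None and el.get(f"{{{W}}}val", "true") not in ("0", "false", "off")


def separators_between_tables(document_xml):
    """Return (index, hidden, empty) for each body paragraph directly between two tables."""
    body = ET.fromstring(document_xml).find("w:body", NS)
    # Keep only block-level paragraphs and tables (skips w:sectPr and similar).
    blocks = [el for el in body if el.tag in (f"{{{W}}}p", f"{{{W}}}tbl")]
    found = []
    for i in range(1, len(blocks) - 1):
        prev, cur, nxt = blocks[i - 1], blocks[i], blocks[i + 1]
        if cur.tag == f"{{{W}}}p" and prev.tag == nxt.tag == f"{{{W}}}tbl":
            # Assumption: the paragraph mark's formatting lives in w:pPr/w:rPr.
            hidden = is_on(cur.find("w:pPr/w:rPr/w:vanish", NS))
            empty = cur.find(".//w:t", NS) is None
            found.append((i, hidden, empty))
    return found


# Hand-written illustration: two tables separated by an empty hidden paragraph.
SAMPLE = f"""<w:document xmlns:w="{W}"><w:body>
<w:tbl><w:tr><w:tc><w:p><w:r><w:t>Some data</w:t></w:r></w:p></w:tc></w:tr></w:tbl>
<w:p><w:pPr><w:rPr><w:vanish/></w:rPr></w:pPr></w:p>
<w:tbl><w:tr><w:tc><w:p><w:r><w:t>Some other data</w:t></w:r></w:p></w:tc></w:tr></w:tbl>
<w:sectPr/>
</w:body></w:document>"""

print(separators_between_tables(SAMPLE))  # [(1, True, True)]
```

For a real file, the XML can typically be read with `zipfile.ZipFile("Test1.docx").read("word/document.xml")` and passed to the same function.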
