Hi!
Software (incl. versions) where issue can be reproduced:
We’re using the NuGet package “Aspose.Words” version 23.10.0 to convert HTML to DOCX with DocumentBuilder.InsertHtml. The problem can also be reproduced with the free online HTML to DOCX converter.
The problem is that when an HTML fragment containing two consecutive tables is converted to DOCX, a paragraph with a hidden font is inserted between the two tables in the DOCX. (This can be seen by showing formatting symbols in the document with Show/Hide ¶.) I have provided two HTML fragments:
Fragment 1:
<table>
<td>
<p> Some data </p>
</td>
</table>
<table>
<td>
<p> Some other data </p>
</td>
</table>
Fragment 2:
<table>
<td>
<p> Some data </p>
</td>
</table>
<p></p>
<table>
<td>
<p> Some other data </p>
</td>
</table>
When these HTML fragments are converted to DOCX using Aspose (Test1.docx (7,8 KB), Test2.docx (7,8 KB)), a paragraph with a hidden font is added between the two tables, which was not there in the HTML fragments. This can be seen in Word by making formatting marks visible with Show/Hide ¶.
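For anyone who wants to check this without opening Word, here is a minimal Python sketch (standard library only) that lists the paragraphs sitting directly between two tables in word/document.xml and reports whether their paragraph mark is hidden. The function name is my own, and it assumes the hidden formatting is stored directly on the paragraph mark (w:vanish under w:pPr/w:rPr) rather than coming from a style, so it may miss other cases.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def _is_on(el):
    # A toggle element such as <w:vanish/> counts as "on" unless
    # its w:val attribute explicitly switches it off.
    if el is None:
        return False
    return el.get(f"{{{W}}}val", "true") not in ("false", "0", "off")


def paragraphs_between_tables(docx):
    """Return one dict per body paragraph that sits directly between two tables.

    `docx` may be a file path or a binary file-like object.
    (Hypothetical helper name, not part of any library.)
    """
    with zipfile.ZipFile(docx) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    children = list(root.find("w:body", NS))
    found = []
    # Slide a window of three siblings over the body: table, paragraph, table.
    for prev, cur, nxt in zip(children, children[1:], children[2:]):
        if (prev.tag == f"{{{W}}}tbl"
                and cur.tag == f"{{{W}}}p"
                and nxt.tag == f"{{{W}}}tbl"):
            # Assumption: only direct formatting on the paragraph mark is checked.
            mark_vanish = cur.find("w:pPr/w:rPr/w:vanish", NS)
            text = "".join(t.text or "" for t in cur.iter(f"{{{W}}}t"))
            found.append({"text": text, "hidden_mark": _is_on(mark_vanish)})
    return found
```

Calling `paragraphs_between_tables("Test1.docx")` and `paragraphs_between_tables("Test2.docx")` on the attached files should make it easy to compare what ended up between the tables in each case.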
From what I have seen, it seems that DOCX doesn’t really allow two tables to follow each other directly, and that is why the hidden paragraph is inserted between them during conversion. Is this right?
Either way, in the second fragment the HTML has an empty paragraph <p></p> between the tables, which should also be visible in the converted DOCX as an empty paragraph. Instead, the DOCX again contains just a paragraph with a hidden font, which does not match the original HTML fragment, where the paragraph creates an empty line between the two tables.
Thanks in advance!