Problem reading HTML td cells containing "BR" tags


#1

I have a simple program that converts HTML files into PDF files using Aspose.Words (version 3.5.3.0). Whenever an HTML file contains a table with a "td" cell that contains a "br" tag, the resulting table is displayed incorrectly. Specifically, any such td cell is much wider than it should be (twice as wide or more). It appears that the "width" attribute of the cell in the resulting table is set to the value it would have if the "br" tags caused no line breaks. Hence, the more "br" tags within the cell, the more pronounced the effect becomes.
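To make the suspected symptom concrete, here is a minimal Python sketch. It is only an illustration of the arithmetic, not Aspose.Words' actual layout code: it uses character counts as a rough stand-in for rendered width, and the cell content is assumed from the test table shown later in this post. The function name `line_widths` is hypothetical.

```python
import re


def line_widths(cell_html: str) -> tuple[int, int]:
    """Return (wrapped, unwrapped) widths in characters for a td's inner HTML.

    wrapped   : length of the longest line when each <br> starts a new line
    unwrapped : length of the text if every <br> were treated as a space
    """
    # Split on <br>, <BR>, <br/> and <br /> variants.
    lines = [part.strip() for part in re.split(r"<br\s*/?>", cell_html, flags=re.IGNORECASE)]
    wrapped = max(len(line) for line in lines)
    unwrapped = len(" ".join(lines))
    return wrapped, unwrapped


# Assumed content of the first cell of TABLE #1.
cell = "R1 C1<br>This<br>cell<br>has<br>several<br>break<br>tags"
wrapped, unwrapped = line_widths(cell)
print(wrapped, unwrapped)  # prints "7 38"
```

If the converter sizes the cell from the unwrapped figure rather than the wrapped one, every extra "br" tag adds to the gap, which would match the behaviour described above.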

Here is some VB.Net code I am using to recreate the problem. It saves to HTML format rather than PDF for simplicity, but the results are the same:

' Load the test HTML file, then save it straight back out as HTML.
Dim doc As New Document(HtmlFileBefore, LoadFormat.FormatHtml, "")
doc.Save(HtmlFileAfter, SaveFormat.FormatHtml)

Here is the HTML file text (just a test file to recreate the problem):

TABLE #1 [BR tags in 1st cell]

R1 C1
This
cell
has
several
break
tags
R1 C2
R2 C1 R2 C2


TABLE #2 [No BR tags in cells]

R1 C1 R1 C2
R2 C1 R2 C2


If you compare the contents or the display of this HTML file with that of the resulting HTML file, you will see the problem.
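One way to make that comparison less manual is to list the width information on every td cell in both files. The following is a rough sketch using only Python's standard library. The helper names (`CellWidthCollector`, `cell_widths`) and the sample markup are hypothetical, and it assumes the width appears either as a `width` attribute or inside a `style` attribute on the td element.

```python
from html.parser import HTMLParser


class CellWidthCollector(HTMLParser):
    """Collect the (width, style) attribute pair of every <td> start tag."""

    def __init__(self):
        super().__init__()
        self.widths = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "td":
            attr_map = dict(attrs)
            self.widths.append((attr_map.get("width"), attr_map.get("style")))


def cell_widths(html_text):
    """Return one (width, style) tuple per td cell, in document order."""
    collector = CellWidthCollector()
    collector.feed(html_text)
    collector.close()
    return collector.widths


# Hypothetical input resembling the first row of TABLE #1.
sample = '<table><tr><td width="50">R1 C1<br>This</td><td>R1 C2</td></tr></table>'
for index, (width, style) in enumerate(cell_widths(sample)):
    print(index, width, style)
```

Running `cell_widths` on the text of the file before conversion and on the file after conversion, then comparing the two lists position by position, should show which cells had their width changed.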

Thanks,

Brett


#2

Hi Brett,

Thank you so much for the report. This issue has been logged as #927. Please give us some time to investigate and fix it.


#3

Sorry, there is no fix for this in the upcoming Aspose.Words 3.6, but we will look at the issue in the future.