We are seeing an issue related to HTML import - where the HTML contains a table with fixed column widths. When the HTML is added to a document and the table’s auto-fit property is disabled, the column widths of the imported table are not consistent with the widths specified in the source HTML. This causes unexpected word-wrap to occur in certain cells.
This behavior can be reproduced with the latest Aspose.Words for Java version (25.8), the attached WordWrap.html and WordWrap.docx files, and the following Java code:
final String htmlSrc = "WordWrap.html";
// load the existing document and create a DocumentBuilder for it
Document wdDoc = new Document("WordWrap.docx");
DocumentBuilder builder = new DocumentBuilder(wdDoc);
builder.moveToDocumentEnd();
// load and insert the HTML table into the doc
String html = new String(Files.readAllBytes(Paths.get(htmlSrc)), "UTF-8");
builder.insertHtml(html, false);
// Disable auto-fit on the imported table
Table table = (Table) builder.getCurrentParagraph().getPreviousSibling();
table.setAllowAutoFit(false);
final String newDoc = htmlSrc.replace(".html", "-with-table.docx");
Files.deleteIfExists(Paths.get(newDoc));
wdDoc.save(newDoc);
System.out.println("Saved Word Document: " + newDoc);
Running the above code should produce a DOCX similar to the attached WordWrap-with-table.docx file.
Key Observations:
Opening the WordWrap.html file in a browser (such as Firefox or Edge) shows the expected column widths and word-wrap in the cells.
Opening the generated WordWrap-with-table.docx file in MS Word shows:
The width of the first column of the table is slightly smaller than the width defined in the HTML (i.e. 3.14" in Word vs 3.15" in the HTML).
As a result of the narrower column, the heading cells in rows 4 and 6 wrap unexpectedly (highlighted in yellow).
Environment Details:
Aspose Words for Java 25.8.0
Java version 1.8
Windows 11 OS (but also reproducible under Linux).
It sounds like you’re encountering an issue with the column widths and word-wrap behavior when importing HTML tables into a DOCX document using Aspose.Words for Java. This can occur when the auto-fit property is disabled, leading to discrepancies between the specified widths in the HTML and the rendered widths in the generated document.
Cause: The issue is likely due to the auto-fit property being disabled, which prevents the table from adjusting its column widths based on the content and specified HTML widths.
Solution: Here are several steps you can take to troubleshoot and resolve this issue:
Check HTML Table Structure: Ensure that the HTML table is well-structured and that the widths are explicitly defined in the HTML. Additional CSS styles can sometimes affect rendering.
Set Column Widths Manually: After inserting the HTML, manually set the widths of the columns in the Aspose table to match those defined in the HTML. This will help maintain consistency.
Here’s a code snippet to set the column widths:
for (int i = 0; i < table.getRows().get(0).getCells().getCount(); i++) {
table.getColumns().get(i).setPreferredWidth(PreferredWidth.fromPoints(3.15f)); // Set to your desired width
}
Test with Different Versions: Since you are using version 25.8, consider testing with the latest version of Aspose.Words for Java. There may have been bug fixes or improvements related to HTML import and table rendering.
Review Word Document Settings: Sometimes, settings in MS Word can affect how documents are displayed. Ensure that there are no overriding styles or settings in Word that could be causing the issue.
Contact Support: If the problem persists, reach out to Aspose support with your specific files and code. They can provide more targeted assistance.
Next Steps: Implement the suggested solutions and test the results. If the issue is resolved, great! If not, gather your files and code and contact Aspose support for further help.
Please let me know if these steps help or if you need further assistance!
Please note that as of the latest Aspose.Words for Java version, there is no Table::getColumns API, so the proposed code will not even compile!
But even if we tweak the code to use the suggested setPreferredWidth API, it is not a feasible solution. We do not want to perform post-processing of the inserted table. Additionally, the suggested code does not seem to take into account things like:
Variable column widths. The only thing we have to work with is the HTML. As such, in order to implement the suggested solution, we would need to parse and interpret the HTML in order to determine the column widths specified therein.
Merged (horizontal and vertical) cells. While we haven’t tested it, wouldn’t the suggested code throw an exception if there are merged cells?
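To make the merged-cell concern concrete, here is a minimal, library-independent sketch of the bookkeeping any post-processing would need. It assumes the column widths (in points) and each cell's colspan have already been extracted from the HTML. The class and method names are hypothetical and not part of Aspose.Words, and vertical merges (rowspan) are deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not an Aspose.Words API.
class MergedCellWidths {

    /**
     * Returns the width of every cell in one row, given the widths of the
     * table's logical columns and the colspan of each cell in that row.
     * A cell that spans several columns gets the sum of their widths.
     */
    static List<Double> cellWidths(double[] columnWidthsPt, int[] colSpans) {
        List<Double> widths = new ArrayList<>();
        int column = 0;
        for (int span : colSpans) {
            if (span < 1 || column + span > columnWidthsPt.length) {
                throw new IllegalArgumentException("colspan does not fit the column grid");
            }
            double sum = 0.0;
            for (int i = 0; i < span; i++) {
                sum += columnWidthsPt[column + i];
            }
            widths.add(sum);
            column += span;
        }
        return widths;
    }

    public static void main(String[] args) {
        // Illustrative values only: three columns, first cell spans two of them.
        double[] columns = {226.5, 100.0, 50.0};
        int[] spans = {2, 1};
        System.out.println(cellWidths(columns, spans));
    }
}
```

The point of the sketch is that a row with a spanning cell has fewer cells than the table has columns, so a loop that indexes cells by column number (as in the suggested snippet) cannot be applied to such rows unchanged.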
@oraspose In your source HTML, the column widths are specified in pixels, but in an MS Word document sizes are specified in twips (1/20 of a point), which is an absolute measurement unit. Please try specifying the sizes in your HTML in points.
Please note that the HTML and MS Word document object models are quite different, and it is not always possible to achieve 100% fidelity when converting from one format to the other.
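For reference, the unit relationships mentioned above can be sketched in plain Java. This assumes the CSS convention of 96 px per inch (so 1 px = 0.75 pt) and rounds to whole twips. How Aspose.Words performs the conversion internally is not stated in this thread, so treat the class below as a hypothetical illustration rather than the library's behavior.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not an Aspose.Words API.
class TableWidthUnits {
    // Assumption: CSS reference pixel, 96 px per inch and 72 pt per inch.
    static final double POINTS_PER_PIXEL = 72.0 / 96.0;
    static final double POINTS_PER_INCH = 72.0;
    static final int TWIPS_PER_POINT = 20;

    static double pixelsToPoints(double px) {
        return px * POINTS_PER_PIXEL;
    }

    // Twips are whole numbers, so a fractional result has to be rounded.
    static long pointsToTwips(double pt) {
        return Math.round(pt * TWIPS_PER_POINT);
    }

    static double twipsToInches(long twips) {
        return twips / (TWIPS_PER_POINT * POINTS_PER_INCH);
    }

    public static void main(String[] args) {
        double widthPx = 302.0; // illustrative width, not taken from WordWrap.html
        double widthPt = pixelsToPoints(widthPx);
        long widthTwips = pointsToTwips(widthPt);
        System.out.printf(Locale.ROOT, "%.1f px = %.2f pt = %d twips = %.4f in%n",
                widthPx, widthPt, widthTwips, twipsToInches(widthTwips));
    }
}
```

Because Word's ruler displays widths to two decimal places of an inch, a difference of a few twips is enough to change the displayed value by 0.01".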
You are right, the px should be specified as pt - my apologies. However, even when the px units are converted to pt the results are the same. To be clear, the use of px was a mistake, the HTML should be:
I certainly understand that HTML and Word do not provide 100% fidelity and that there is always the possibility of rounding issues. But I guess what we’re really trying to determine is whether there is a way to get the output shown by the browsers in the Word document.
The solution provided by the AI response is an option, but we’ve provided reasons why we don’t want to perform the post processing on the table. Is there any way the insertHtml API could potentially handle that for the client (with a new flag for example)?
You can see that the text in the browser is a little shorter than in the MS Word document. It looks like the browser and MS Word handle fonts differently.
Unfortunately, HTML and MS Word documents are quite different, and it is impossible to get exactly the same visual representation in a browser and in an MS Word document.
Hi, I ran into something similar. In my case the issue was not the table width itself but how Word normalizes cell borders and internal margins on import. Even a 0.01" mismatch cascaded into wrap changes. What helped me was explicitly setting border-collapse:collapse and defining cellpadding=0 inline in the HTML before insertion, which made the DOCX rendering align much closer to the browser output. Hopefully this helps.
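A rough sketch of that pre-processing step, using only the JDK so it can run before the HTML string is passed to insertHtml. The class name and regular expressions are hypothetical, it only understands double-quoted style attributes, and a real HTML parser would be more robust for anything beyond simple markup.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not an Aspose.Words API.
class TableHtmlNormalizer {
    private static final Pattern TABLE_TAG =
            Pattern.compile("<table\\b([^>]*)>", Pattern.CASE_INSENSITIVE);
    private static final Pattern STYLE_ATTR =
            Pattern.compile("\\bstyle\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);
    private static final Pattern CELLPADDING_ATTR =
            Pattern.compile("\\bcellpadding\\s*=\\s*(\"[^\"]*\"|'[^']*'|[^\\s>]+)",
                    Pattern.CASE_INSENSITIVE);

    /** Forces border-collapse:collapse and cellpadding="0" on every table tag. */
    static String normalizeTables(String html) {
        Matcher table = TABLE_TAG.matcher(html);
        StringBuffer out = new StringBuffer(); // StringBuffer keeps this Java 8 compatible
        while (table.find()) {
            // Drop any existing cellpadding so the value set here is the only one.
            String attrs = CELLPADDING_ATTR.matcher(table.group(1)).replaceAll("");
            Matcher style = STYLE_ATTR.matcher(attrs);
            if (style.find()) {
                // Keep the existing declarations and append ours last so it wins.
                String css = style.group(1).trim();
                if (!css.isEmpty() && !css.endsWith(";")) {
                    css += ";";
                }
                css += "border-collapse:collapse";
                attrs = attrs.substring(0, style.start())
                        + "style=\"" + css + "\""
                        + attrs.substring(style.end());
            } else {
                attrs += " style=\"border-collapse:collapse\"";
            }
            String tag = "<table" + attrs + " cellpadding=\"0\">";
            table.appendReplacement(out, Matcher.quoteReplacement(tag));
        }
        table.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<table style=\"width:400pt\" cellpadding=\"4\"><tr><td>x</td></tr></table>";
        System.out.println(normalizeTables(html));
    }
}
```

In the reproduction code from the first post, the result of normalizeTables(html) would be passed to builder.insertHtml in place of the raw string. Whether this removes the wrap in a given document depends on how the import maps padding and borders, so it is worth checking against the actual files.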