Hi,
We noticed differences in line breaks between the Word document and the result saved as HTML_FIXED.
Input file:
mini.docx (3.7 MB)
Code:
private void convertDocxToHtml(String docxPath, String htmlPath) throws Exception {
    Document doc = new Document(docxPath);
    doc.save(htmlPath, new HtmlFixedSaveOptions());
}
Document view:
Html view:
Unfortunately, such differences in appearance are not acceptable to us.
- What is the cause of such behavior?
- Is it possible to prepare the document so that the line breaks do not change?
Aspose Words for Java v23.7
Docx edited and viewed in: Microsoft® Word for Microsoft 365 MSO (version 2409 build 16.0.18025.20160) 32-bit
xhtml viewed in: Chrome 129.0.6668.101 (Official version) (64-bit)
@mariusz.bajan The problem occurs because advanced typography features are used in your document. Aspose.Words supports advanced typography features via the Aspose.Words.Shaping.HarfBuzz package.
<dependency>
<groupId>com.aspose</groupId>
<artifactId>aspose-words</artifactId>
<version>24.4</version>
<classifier>shaping-harfbuzz-plugin</classifier>
</dependency>
You should install the above package and modify the code as shown below:
Document doc = new Document("C:\\Temp\\in.docx");
doc.getLayoutOptions().setTextShaperFactory(com.aspose.words.shaping.harfbuzz.HarfBuzzTextShaperFactory.getInstance());
doc.save("C:\\Temp\\out.pdf");
This works properly for PDF, but is not yet supported for HtmlFixed.
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): WORDSNET-27524
You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.
@mariusz.bajan You can use the following code as a workaround:
Document doc = new Document("C:\\Temp\\in.docx");
doc.getLayoutOptions().setTextShaperFactory(com.aspose.words.shaping.harfbuzz.HarfBuzzTextShaperFactory.getInstance());
// Update page layout.
doc.updatePageLayout();
HtmlFixedSaveOptions opt = new HtmlFixedSaveOptions();
opt.setPrettyFormat(true);
opt.setExportEmbeddedCss(true);
opt.setExportEmbeddedFonts(true);
opt.setExportEmbeddedImages(true);
opt.setExportEmbeddedSvg(true);
doc.save("C:\\Temp\\out_html_fixed.html", opt);
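Both snippets above assume the shaping-harfbuzz-plugin artifact is actually on the classpath. If in doubt, a quick check along the following lines may help before touching the layout options. This is only a minimal sketch: the class `ShaperClasspathCheck` and the method `isOnClasspath` are hypothetical helper names, and the only Aspose-specific detail is the factory class name copied from the code in this thread.

```java
public class ShaperClasspathCheck {
    // Fully qualified class name as used in the snippets in this thread.
    static final String SHAPER_FACTORY =
            "com.aspose.words.shaping.harfbuzz.HarfBuzzTextShaperFactory";

    /** Returns true if the named class can be located (without initializing it). */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ShaperClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Class is absent, or present but not loadable in this environment.
            return false;
        }
    }

    public static void main(String[] args) {
        if (isOnClasspath(SHAPER_FACTORY)) {
            System.out.println("HarfBuzz shaper plugin found.");
        } else {
            System.out.println("HarfBuzz shaper plugin missing; check the Maven classifier.");
        }
    }
}
```

The check uses only standard reflection, so it compiles and runs even when the plugin is absent.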