We are experiencing severe formatting issues when converting Word documents that contain Japanese text to HTML.
We are using fairly straightforward code:
HtmlFixedSaveOptions options = new HtmlFixedSaveOptions();
options.setExportEmbeddedImages(true);
options.setExportEmbeddedCss(true);
options.setExportEmbeddedFonts(true);
options.setExportEmbeddedSvg(true);
options.setUpdateFields(false);
Document doc = new Document(inputStream);
doc.save(outputStream, options);
Here is a screenshot of a test document in Word (DOCX) and one of the resulting HTML:
word_document.png (63.8 KB)
html_result.jpg (48.8 KB)
It seems that most paragraphs are rendered on a single line, so the text overlaps itself.
It would be great if you could help us. Are there any save options we need to set to solve these issues?
Thanks a lot,