Missing whitespaces in raw HTML when rendering justified text layout from Word document

Hi, I’ve come across a rendering issue when converting Word documents to fixed layout HTML.

The Word document uses “justify text” to format its paragraphs. When converted to fixed layout HTML, the rendering matches that of the Word document, but copying the text from the HTML produces a selection in which almost all spaces between words have been dropped.

Here’s a minimal reproduction document: justifiedText.docx (11.7 KB)

And here are some screenshots to help visualise the problem.

Being able to select the text value from the HTML rendering is really important to us.

It seems to me that spaces are only inserted in the fixed layout HTML if the gap between words in the Word document is greater than or equal to the width of a space character.
As all the words are absolutely positioned, it feels to me like there is no harm in appending a non-breaking space (&nbsp;) at the end of each word. If the gap between two words is narrower than a space, the second word will simply overlap the space slightly.
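
To make the suggestion above concrete, here is a minimal sketch in C#. It is only an illustration of the idea, not how Aspose.Words generates its output: it uses no Aspose API, the `WordSpacing.AppendNbsp` helper is hypothetical, and it assumes (without having confirmed it) that each word ends up in its own `<span>` with no nested tags.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordSpacing
{
    // Matches <span ...>text</span> where the text contains no nested tags.
    // Assumption: one absolutely positioned span per word (hypothetical markup).
    private static readonly Regex WordSpan =
        new Regex(@"(<span\b[^>]*>)([^<]+)(</span>)", RegexOptions.Compiled);

    // Appends an &nbsp; entity to the text of each matched span, unless the
    // text already ends with a whitespace character or an &nbsp; entity.
    public static string AppendNbsp(string html)
    {
        return WordSpan.Replace(html, m =>
        {
            string text = m.Groups[2].Value;
            bool alreadySpaced =
                text.EndsWith("&nbsp;", StringComparison.Ordinal) ||
                char.IsWhiteSpace(text[text.Length - 1]);
            return alreadySpaced
                ? m.Value
                : m.Groups[1].Value + text + "&nbsp;" + m.Groups[3].Value;
        });
    }
}
```

For example, passing `<span style="left:10px">Hello</span>` to `WordSpacing.AppendNbsp` returns `<span style="left:10px">Hello&nbsp;</span>`, while a span whose text already ends in `&nbsp;` is returned unchanged. Because the spans are absolutely positioned, the extra character should not shift the following words, which is the premise of the suggestion.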

Let me know if you require more details.
Thanks in advance.

@njlgad,

Please try the following code with the latest 21.5 version of Aspose.Words for .NET and see whether it resolves the issue on your end.

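using Aspose.Words;
using Aspose.Words.Saving;

// Load the reproduction document.
Document doc = new Document("C:\\temp\\229200\\justifiedText.docx");

// Embed fonts in the fixed layout HTML, including fonts with PostScript outlines.
HtmlFixedSaveOptions options = new HtmlFixedSaveOptions();
options.ExportEmbeddedFonts = true;
options.AllowEmbeddingPostScriptFonts = true;

doc.Save("C:\\temp\\229200\\21.5.html", options);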

I opened the resulting HTML in Chrome, Firefox and Internet Explorer, copied all the text (Ctrl+A, then Ctrl+C) from each browser and pasted it into Notepad. I did not notice any such issue in Notepad, i.e. the spacing is properly preserved in this case.

Hi,

Thanks for your reply. The output from Aspose is indeed correct. We just found out that some post-processing of the HTML document is causing the whitespace issue.

@njlgad,

It is great that you were able to resolve this issue on your end. Please let us know if you have any further queries in the future.