We receive documents created in many external systems, which we convert to PDF before loading them into our central document store. This has worked fine in most cases, except for the latest source, which provides documents in XHTML format where blocks of justified text are going a tad squiffy.
The attached archive contains a test example of an original document we receive from this system, how it is displayed to users in IE/Chrome when they view it in the source system, and a PDF that was created using Aspose.Words.
As you can see in the PDF, the individual lines under “Diagnosis” are fully justified, which we wouldn’t expect since they are single lines, and the last line of the paragraph below them is also justified across the page where we would expect it to stay left-aligned.
Thank you in advance for any assistance you can provide.
SanitisedExample.zip (351.2 KB)
Thanks for your inquiry. We have tested the scenario and noticed the reported issue. We have logged a ticket WORDSNET-15597 in our issue tracking system for further investigation and rectification. We will notify you as soon as it is resolved.
We are sorry for the inconvenience.
Thanks for your patience. We have investigated the issue and found that text is justified differently in HTML and MS Word when it contains line breaks (br). In HTML, lines of such text look left-aligned, but MS Word expands them to fill the whole line.
MS Word’s behavior can be changed. If Document.CompatibilityOptions.DoNotExpandShiftReturn is set to true, MS Word will not expand lines that end with a line break. As a workaround, you can set this option manually after the document is imported from HTML, as follows. Hopefully this will help you resolve the issue.
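using Aspose.Words;
Document doc = new Document("SanitisedExample.xhtml");
// Do not stretch justified lines that end with a line break (Shift+Return in MS Word).
doc.CompatibilityOptions.DoNotExpandShiftReturn = true;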
The issues you found earlier (filed as WORDSNET-15597) have been fixed in the Aspose.Words for .NET 17.9 and Aspose.Words for Java 17.9 updates.