Slight differences in formatting when converting DOCX to HTML and back to DOCX

Hi there.
I’m using Aspose.Words for .NET and my application must support the following workflow:

  1. Convert a DOCX file to HTML;
  2. Manipulate the file content via HTML (optional step);
  3. Convert the HTML back to DOCX.
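
For reference, the three steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the attached sample project: the file names are placeholders, and it assumes the Aspose.Words for .NET assembly is referenced. `ExportRoundtripInformation` is an existing `HtmlSaveOptions` property; whether it is enough to preserve layout in a given document is exactly what this thread is about.

```csharp
using Aspose.Words;
using Aspose.Words.Saving;

class RoundTripSketch
{
    static void Main()
    {
        // Step 1: DOCX -> HTML. "input.docx" is a placeholder file name.
        Document source = new Document("input.docx");
        HtmlSaveOptions htmlOptions = new HtmlSaveOptions(SaveFormat.Html)
        {
            // Ask Aspose.Words to write extra round-trip data into the HTML,
            // set explicitly here so the intent is visible.
            ExportRoundtripInformation = true
        };
        source.Save("intermediate.html", htmlOptions);

        // Step 2 (optional): edit intermediate.html here.

        // Step 3: HTML -> DOCX. The load format is detected from the file.
        Document roundTripped = new Document("intermediate.html");
        roundTripped.Save("output.docx", SaveFormat.Docx);
    }
}
```

With no edits in step 2, `output.docx` would ideally paginate the same way as `input.docx`.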

We managed to keep the original formatting, page options and all, but the output DOCX is still slightly different from the original one, even when there is no HTML manipulation or content edit.

Example:
Paragraphs with the same font, size and style are displayed differently in each document, which leads to incorrect page/section breaks.

Here are my input and output documents (with no content edits in this case): input-output-words.zip (514.6 KB)

And here is a sample project containing my conversion code: convertion-src.zip (127.5 KB)

My Aspose.Words version is 18.9.0.

Please help me out, since my customers are unable to use my application due to this issue.

@dionisioleonardo,

Please also provide comparison screenshot(s) highlighting all the problematic areas in the final Word document generated by Aspose.Words, and attach them here for our reference. We will then investigate the issue(s) on our end and provide you with more information. Thanks for your cooperation.

Here they are:

Screenshot_1.jpg (385.3 KB)

In this screenshot you can see:
A) The first paragraph is displayed slightly differently in the output document
B) The content doesn’t fit in the page

This leads to an incorrect page break, as you can see in this other screenshot:
Screenshot_2.jpg (285.9 KB)

This problem happens throughout the document, not just on the first page. I only took screenshots of the first page as an example.

@dionisioleonardo,

We have logged this problem in our issue tracking system so that it can be corrected. The ID of this issue is WORDSNET-17474. We will look further into the details of this problem and will keep you updated on its status. We apologize for the inconvenience.

Hi there.

Any updates on this?

Thanks!

@dionisioleonardo,

Regarding WORDSNET-17474, we have completed the analysis of this issue and identified the root cause. We have also implemented a fix, but we are currently reviewing and testing all code changes made in the scope of this issue. We will inform you via this thread as soon as your issue is resolved. We apologize for any inconvenience.

Hi there.

Any ETA for this fix?

Thanks.

@dionisioleonardo,

We have good news for you: WORDSNET-17474 has now been resolved, and the fix will be integrated into the next release of Aspose.Words, i.e. 18.12. That release is expected within the next couple of days, and we will inform you via this thread as soon as it is published.

The issues you found earlier (filed as WORDSNET-17474) have been fixed in the Aspose.Words for .NET 18.12 update and the Aspose.Words for Java 18.12 update.