Word files saved to HTML produce different results on different machines

When a DOCX file is saved to HTML, the output differs from one machine to another.

This issue is breaking our CI processes.

Could you help in figuring out what might cause this difference?

The locale settings are the same on both machines (en-US).

Attachments:
The source docx file.
baseFile.docx (48.8 KB)

The code that is used for saving the file.

public void CreateHTMLFromDocument(string filePath, string previewFolderName)
{
    HtmlSaveOptions options = new HtmlSaveOptions(SaveFormat.Html)
    {
        ImagesFolder = Path.Combine(previewFolderName, "doc_files"),
        ImagesFolderAlias = "doc_files",
        ExportHeadersFootersMode = ExportHeadersFootersMode.FirstSectionHeaderLastSectionFooter
    };
    AsposeLicenseInit.AssureWordsLicenseSet();
    Aspose.Words.Document doc = new Aspose.Words.Document(filePath);
    doc.Save(Path.Combine(previewFolderName, "doc.html"), options);
}

The files have slightly different margin values in them.

Do you have any idea what might cause these margin values to not be consistent?

The discrepancy is small, but because the values are not identical, it breaks our automation pipelines. Our customers also expect the resulting files to be identical on different machines.

Thanks in advance!

@memoq

It sounds like you’re experiencing inconsistencies when converting DOCX files to HTML across different machines, which can disrupt your CI processes. Let’s address this issue systematically.

Cause: The inconsistencies may stem from several factors, including differences in Aspose.Words versions, environmental settings, document properties, or even the code used for conversion.

Solution: Here are some steps to troubleshoot the issue:

  1. Aspose.Words Version: Verify that both machines are using the same version of Aspose.Words, as different versions can lead to variations in document conversion.

  2. Environment Differences: Check for discrepancies in the environment, such as installed fonts, operating system updates, or other software that might influence rendering.

  3. Document Properties: Examine the properties of the DOCX files. Embedded styles or specific formatting can sometimes result in different outputs.

  4. Code Consistency: Ensure that the code used for conversion is identical on both machines. Even minor differences can lead to varying results.

  5. Testing with a Simple Document: Convert a simple DOCX file on both machines to determine if the issue persists. This can help isolate whether the problem is with specific documents or the conversion process itself.
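Steps 1 and 2 above can be checked with a few lines of diagnostic code. The following is a minimal sketch, not an official Aspose tool: the class and method names (FontInventory, ListFontFiles, AsposeWordsVersion) are made up for illustration, and it assumes that comparing font file names in the system Fonts folder is a good enough proxy for "same fonts installed". Run it on both machines and compare the output.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper: lists the font files in a folder so that two
// machines can be compared. Only file names are compared, not font versions.
public static class FontInventory
{
    private static readonly string[] FontExtensions = { ".ttf", ".ttc", ".otf" };

    // Returns the lower-cased names of font files in the folder, sorted.
    public static string[] ListFontFiles(string folder)
    {
        if (!Directory.Exists(folder))
            return new string[0];

        return Directory.GetFiles(folder)
            .Select(path => Path.GetFileName(path).ToLowerInvariant())
            .Where(name => FontExtensions.Any(ext => name.EndsWith(ext, StringComparison.Ordinal)))
            .OrderBy(name => name, StringComparer.Ordinal)
            .ToArray();
    }

    // Reports the version of the loaded Aspose.Words assembly, given any of its types,
    // for example: FontInventory.AsposeWordsVersion(typeof(Aspose.Words.Document)).
    public static string AsposeWordsVersion(Type anyAsposeType)
    {
        return anyAsposeType.Assembly.GetName().Version.ToString();
    }

    // Prints the system font inventory to the console.
    public static void PrintSystemFonts()
    {
        string fontsFolder = Environment.GetFolderPath(Environment.SpecialFolder.Fonts);
        Console.WriteLine("Fonts folder: " + fontsFolder);
        foreach (string name in ListFontFiles(fontsFolder))
            Console.WriteLine(name);
    }
}
```

On a standard Windows installation, Times New Roman is normally present as times.ttf (plus the bold and italic variants), so a missing entry in one of the two listings would be a strong hint.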

Next Steps: If the issue continues, consider logging the conversion process or using a tool to compare the generated HTML files for differences. This will help identify the root cause of the inconsistencies.
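As an illustration of the comparison suggested above, here is a small line-by-line diff that a CI job could run on the two HTML outputs. It is only a sketch using the .NET base class library: the HtmlDiff class is hypothetical, and a positional comparison like this assumes both files have the same line structure, which is plausible when only margin values differ.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: reports which lines differ between two text files.
public static class HtmlDiff
{
    // Compares two line arrays position by position and describes each mismatch.
    public static List<string> DiffLines(string[] left, string[] right)
    {
        var differences = new List<string>();
        int lineCount = Math.Max(left.Length, right.Length);

        for (int i = 0; i < lineCount; i++)
        {
            string a = i < left.Length ? left[i] : "<missing>";
            string b = i < right.Length ? right[i] : "<missing>";

            // Ordinal comparison: any byte-level difference counts.
            if (!string.Equals(a, b, StringComparison.Ordinal))
                differences.Add("line " + (i + 1) + ": [" + a + "] vs [" + b + "]");
        }
        return differences;
    }

    // Convenience overload that reads both files from disk.
    public static List<string> DiffFiles(string leftPath, string rightPath)
    {
        return DiffLines(File.ReadAllLines(leftPath), File.ReadAllLines(rightPath));
    }
}
```

For example, HtmlDiff.DiffFiles(@"machine1\doc.html", @"machine2\doc.html") (both paths are placeholders) returns one entry per differing line, which makes it easy to see exactly which style attributes carry the different margin values.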

Please let me know if you need further assistance or if this resolves your issue!

We have checked everything on the list above. The uploaded sample is a relatively small document, and the discrepancy persists. As mentioned in the original post, it is the margin values that differ between the two outputs, but that gives me no clue as to what might cause the difference.
The file uses Times New Roman, which is a standard font available practically everywhere.

@memoq Could you please provide details of the environments where the differing HTML files are produced? We will try to reproduce the problem on our side. Also, please attach the problematic output HTML files here for our reference.

@alexey.noskov Thanks for picking this up. I can provide any details you deem relevant. Could you give me a list of the parameters you consider useful?

In the meantime I will collect the exact outputs for you.

@memoq I will consult with the development team regarding the possible reasons for the difference and get back to you soon.

sample.zip (51.7 KB)

@alexey.noskov In the meantime, here are the specs and the concrete files for the issue.

Machine 1: WINDOCKER5
Default Locale (Culture): en-US
Environment Details:
Machine Name: D7D35C1CB852
OS Version: Microsoft Windows NT 10.0.20348.0
OS Platform: Win32NT
OS Architecture: Microsoft Windows 10.0.20348
Processor Count: 48
System Directory: C:\Windows\system32
User Domain Name: User Manager
User Name: ContainerAdministrator
Tick Count: -2131969500
Is 64-bit Operating System: True
Is 64-bit Process: True
CLR Version: 4.0.30319.42000

Machine 2: Shubus02
Default Locale (Culture): en-US
Environment Details:
Machine Name: SHUBUS01
OS Version: Microsoft Windows NT 10.0.14393.0
OS Platform: Win32NT
OS Architecture: Microsoft Windows 10.0.14393
Processor Count: 12
System Directory: C:\Windows\system32
User Domain Name: MEMOQ
User Name: SYSTEM
Tick Count: -2137448656
Is 64-bit Operating System: True
Is 64-bit Process: True
CLR Version: 4.0.30319.42000

The source document and the two saved HTML files:
base_file.docx (48.8 KB)

@memoq Thank you for the additional information. The difference can be caused by a difference in the fonts available in your environments.

If Aspose.Words cannot find a font used in the document, the font is substituted. This can lead to a font mismatch and to differences in document layout, because the substituted font has different metrics. You can implement IWarningCallback to get notified when a font substitution is performed.
Please see our documentation to learn where Aspose.Words looks for fonts:
https://docs.aspose.com/words/net/specifying-truetype-fonts-location/
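
A minimal sketch of the IWarningCallback approach described above follows. The class names (FontSubstitutionCollector, FontDiagnostics) and the fontsFolder parameter are illustrative and not part of Aspose.Words. The sketch also assumes that font substitution warnings are raised while the layout is built, so it calls UpdatePageLayout to force that step before saving; whether this is needed for HTML output may depend on the document and the Aspose.Words version.

```csharp
using System;
using System.Collections.Generic;
using Aspose.Words;
using Aspose.Words.Fonts;
using Aspose.Words.Saving;

// Collects font substitution warnings reported by Aspose.Words.
public class FontSubstitutionCollector : IWarningCallback
{
    public List<string> Messages { get; } = new List<string>();

    public void Warning(WarningInfo info)
    {
        // Keep only font substitution warnings; ignore all other warning types.
        if (info.WarningType == WarningType.FontSubstitution)
            Messages.Add(info.Description);
    }
}

public static class FontDiagnostics
{
    // Saves the document to HTML and prints any font substitutions that occurred.
    // fontsFolder is optional (hypothetical parameter): when set, Aspose.Words
    // looks for fonts there instead of in the default system locations.
    public static void SaveWithFontReport(string docxPath, string htmlPath, string fontsFolder)
    {
        Document doc = new Document(docxPath);

        var collector = new FontSubstitutionCollector();
        doc.WarningCallback = collector;

        if (!string.IsNullOrEmpty(fontsFolder))
        {
            FontSettings fontSettings = new FontSettings();
            // true = also scan subfolders. This replaces the default font sources.
            fontSettings.SetFontsFolder(fontsFolder, true);
            doc.FontSettings = fontSettings;
        }

        // Assumption: building the layout resolves fonts and triggers the warnings.
        doc.UpdatePageLayout();
        doc.Save(htmlPath, new HtmlSaveOptions(SaveFormat.Html));

        foreach (string message in collector.Messages)
            Console.WriteLine("Font substitution: " + message);
    }
}
```

If one machine prints substitution messages and the other does not, the font sets differ. Shipping the required fonts in a folder alongside the build and passing that folder on every machine is one way to make the output independent of the fonts installed on the host.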