Conversion from DOCX to HTML doesn't reproduce the DOCX layout exactly

Hello,

We are trying to convert Word (DOCX/DOC) files to HTML using the Aspose.Words NuGet package, version 24.7.0.

Link: NuGet Gallery | Aspose.Words 24.7.0

For the conversion we use HtmlFixedSaveOptions, because we want the HTML to look the same as the Word file, as shown in the code below.

HtmlFixedSaveOptions htmlFixedSaveOptions = new HtmlFixedSaveOptions();
htmlFixedSaveOptions.ExportEmbeddedCss = true;
htmlFixedSaveOptions.ExportEmbeddedFonts = true;

var wordtohtml = new Aspose.Words.Document(stream);
wordtohtml.Save(path, htmlFixedSaveOptions);

But the resulting HTML does not show the page content exactly as it appears in the Word file.

For reference, I have three files.

  1. Original Document(TestDocument.docx)
    TestDocument.docx (13.5 KB)

  2. Converted Document(1Output_d6888968-e47a-42e4-86ec-599a33b4b2bd.zip) which has 1Output_d6888968-e47a-42e4-86ec-599a33b4b2bd.html file.
    1Output_d6888968-e47a-42e4-86ec-599a33b4b2bd.zip (23.0 KB)

  3. Screenshot(html-conversion.jpg)

If you look at the screenshot, the number of lines at the end of the first page differs between the original document (DOCX) and the converted document (HTML).
We would like the HTML to look exactly like the DOCX file.
Please suggest a way to achieve this.

@ankitchhelavda This is a font issue. While saving to HtmlFixed, Aspose.Words reports the following warning:
"Font substitution: Font 'Aptos' has not been found. Using 'Arial' font instead. Reason: font info substitution.".
You need to install this font before converting the document.

If Aspose.Words cannot find a font used in the document, it substitutes another font. This can cause a font mismatch and layout differences, because the substitute font has different metrics. You can implement IWarningCallback to get notified when a font substitution is performed.
Please see our documentation to learn where Aspose.Words looks for fonts:
https://docs.aspose.com/words/net/specifying-truetype-fonts-location/
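
As a minimal sketch of the IWarningCallback approach mentioned above: the class name FontSubstitutionLogger, the method ConvertWithWarnings, and its parameters are illustrative names, not part of the library. The callback has to be assigned before Save is called, since font substitution warnings are raised while the layout is built.

```csharp
using System;
using Aspose.Words;
using Aspose.Words.Saving;

// Hypothetical helper: prints every font substitution warning to the console.
public class FontSubstitutionLogger : IWarningCallback
{
    public void Warning(WarningInfo info)
    {
        // Only report font substitutions; other warning types are ignored here.
        if (info.WarningType == WarningType.FontSubstitution)
            Console.WriteLine("Font substitution: " + info.Description);
    }
}

public static class WarningExample
{
    // inputPath and outputPath are placeholders supplied by the caller.
    public static void ConvertWithWarnings(string inputPath, string outputPath)
    {
        Document doc = new Document(inputPath);

        // Assign the callback before saving so layout warnings are captured.
        doc.WarningCallback = new FontSubstitutionLogger();

        HtmlFixedSaveOptions options = new HtmlFixedSaveOptions();
        options.ExportEmbeddedCss = true;
        options.ExportEmbeddedFonts = true;
        doc.Save(outputPath, options);
    }
}
```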

@vyacheslav.deryushev Is there a generic way for Aspose to download missing fonts from somewhere online, put them in a folder whose path is given at conversion time, and then use them?

@ankitchhelavda No, unfortunately, there is no such way. You should provide the required fonts yourself. Aspose.Words can only warn you when a font is missing.
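
One way to provide the fonts yourself is to point Aspose.Words at a folder containing them, in addition to the system fonts. The following is only a sketch: the class and method names and the fontsFolder parameter are assumptions for illustration, and the folder has to contain the font files (for example Aptos) that you have obtained yourself.

```csharp
using Aspose.Words;
using Aspose.Words.Fonts;
using Aspose.Words.Saving;

public static class CustomFontsExample
{
    // inputPath, outputPath and fontsFolder are placeholders supplied by the caller.
    public static void Convert(string inputPath, string outputPath, string fontsFolder)
    {
        Document doc = new Document(inputPath);

        // Search the system fonts first, then the custom folder (including subfolders).
        FontSettings fontSettings = new FontSettings();
        fontSettings.SetFontsSources(new FontSourceBase[]
        {
            new SystemFontSource(),
            new FolderFontSource(fontsFolder, true)
        });
        doc.FontSettings = fontSettings;

        HtmlFixedSaveOptions options = new HtmlFixedSaveOptions();
        options.ExportEmbeddedCss = true;
        options.ExportEmbeddedFonts = true;
        doc.Save(outputPath, options);
    }
}
```

Using SetFontsSources with both sources keeps the system fonts available; SetFontsFolder on its own would replace them with the folder.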

@alexey.noskov
But Aspose is also not picking up fonts that are installed on the system. The document uses the ‘Times New Roman’ font, which is installed on the Windows system, yet Aspose uses Fanwood or Arial in the HTML conversion.

Secondly, what if the document uses multiple fonts?

@ankitchhelavda The fonts must be available to Aspose.Words. You can use the following code to check which fonts are available:

/// <summary>
/// Prints the fonts available in the specified font settings.
/// </summary>
public static void PrintAvailableFonts(FontSettings fs)
{
    // Each font source (system fonts, folder, etc.) is listed separately.
    foreach (FontSourceBase fsb in fs.GetFontsSources())
    {
        Console.WriteLine(fsb.Type);
        foreach (PhysicalFontInfo pfi in fsb.GetAvailableFonts())
        {
            Console.WriteLine(pfi.FullFontName);
        }
        Console.WriteLine("================================================");
    }
}

You can use any number of fonts in your documents, but all of them must be available to Aspose.Words to build the correct document layout.
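
For a document with several fonts, a rough check is to compare the fonts declared in the document against the fonts Aspose.Words can see. This is only a sketch with hypothetical names (FontAvailabilityCheck, FindMissingFonts). It assumes that matching on the font family name is good enough, and note that doc.FontInfos lists the fonts declared in the document, which may include fonts that are never actually used in the text.

```csharp
using System;
using System.Collections.Generic;
using Aspose.Words;
using Aspose.Words.Fonts;

public static class FontAvailabilityCheck
{
    // Returns the names of fonts declared in the document that are not
    // found in any font source visible to Aspose.Words.
    public static List<string> FindMissingFonts(Document doc)
    {
        // When no per-document settings are assigned, the default instance is used.
        FontSettings fs = doc.FontSettings ?? FontSettings.DefaultInstance;

        // Collect the family names of all physically available fonts.
        var available = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (FontSourceBase source in fs.GetFontsSources())
        {
            foreach (PhysicalFontInfo pfi in source.GetAvailableFonts())
            {
                available.Add(pfi.FontFamilyName);
            }
        }

        // Compare against the fonts declared in the document.
        var missing = new List<string>();
        foreach (FontInfo fi in doc.FontInfos)
        {
            if (!available.Contains(fi.Name))
                missing.Add(fi.Name);
        }
        return missing;
    }
}
```

Any name this returns is a candidate for substitution during conversion, so those are the fonts to install or place in a folder that Aspose.Words scans.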