Free Support Forum - aspose.com

Docx to HTML conversion issue with special characters using C#

Hi Team,

We are trying to convert DOC/DOCX files to HTML and have run into an issue: when we convert a document written in Danish, its special characters are not preserved in the output HTML.

We are using Aspose.Words for .NET for the conversion. The code is below, and the attached ZIP file contains the input document and the output HTML.

Aspose.Words.Saving.HtmlSaveOptions saveOptions = new Aspose.Words.Saving.HtmlSaveOptions();
saveOptions.CssStyleSheetType = Aspose.Words.Saving.CssStyleSheetType.Embedded;
saveOptions.ExportImagesAsBase64 = true;
wrdf.Save(htmlOutput, saveOptions);

CL CV v188.zip (126.7 KB)

@agrawaltejas

We tested the scenario with the latest version, Aspose.Words for .NET 20.2, and could not reproduce the issue. Please upgrade to Aspose.Words for .NET 20.2. The output HTML is attached to this post for your reference. 20.2.zip (31.2 KB)

Hi Tahir,

I updated the DLLs to 20.2.0 through NuGet, but I am still getting the same HTML with incorrect characters. Could you please tell me whether the code snippet I provided above is correct for these characters, or whether I need to add something else? Could you also share the code for converting DOCX/DOC to HTML with everything embedded in the HTML?

@agrawaltejas

Please use the following code example to achieve your requirement.

Document doc = new Document(MyDir + "CL CV v188.doc");
HtmlSaveOptions htmlSaveOptions = new HtmlSaveOptions();
htmlSaveOptions.ExportImagesAsBase64 = true;
htmlSaveOptions.ExportFontsAsBase64 = true;
htmlSaveOptions.ExportFontResources = true;
htmlSaveOptions.CssStyleSheetType = CssStyleSheetType.Embedded;
doc.Save(MyDir + "20.2.html", htmlSaveOptions);

Could you please ZIP and attach the output HTML generated by the latest version, Aspose.Words 20.2? Please also share information about your specific culture, such as the culture name, language, and country/region. We will investigate the issue and provide more information on it.

We do not set any culture settings in our code. I tried the same code you provided above, but the issue is still the same. CL Latest Html.zip (128.2 KB)

@agrawaltejas

Please make sure that the fonts used in your document are installed on the machine where you are converting it. Moreover, please try converting your document on another system and let us know if you still face the same issue.
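
To check whether missing fonts are involved, one option is to log the font substitution warnings that Aspose.Words raises. The following is only a minimal sketch, not a confirmed fix: the class name FontSubstitutionCollector and the file paths are placeholders, and the call to UpdatePageLayout() is included on the assumption that building the page layout is what forces fonts to be resolved.

```csharp
using System;
using System.Collections.Generic;
using Aspose.Words;
using Aspose.Words.Saving;

// Hypothetical helper: records every font substitution warning it receives.
public class FontSubstitutionCollector : IWarningCallback
{
    public List<string> Messages { get; } = new List<string>();

    public void Warning(WarningInfo info)
    {
        // Keep only font substitution warnings and ignore all other types.
        if (info.WarningType == WarningType.FontSubstitution)
            Messages.Add(info.Description);
    }
}

public static class Program
{
    public static void Main()
    {
        // Placeholder path; point it at the document that converts incorrectly.
        Document doc = new Document("CL CV v188.doc");

        FontSubstitutionCollector collector = new FontSubstitutionCollector();
        doc.WarningCallback = collector;

        // Assumption: building the page layout forces fonts to be resolved,
        // so substitutions are reported even though the target is HTML.
        doc.UpdatePageLayout();

        HtmlSaveOptions options = new HtmlSaveOptions();
        options.ExportImagesAsBase64 = true;
        options.CssStyleSheetType = CssStyleSheetType.Embedded;
        doc.Save("output.html", options);

        // Print whatever substitutions were reported on this machine.
        foreach (string message in collector.Messages)
            Console.WriteLine(message);
    }
}
```

If the list is empty on one machine and not on another, that points to a difference in the installed fonts between the two.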

Hi Tahir,

We have deployed the code on our development server and host it as a Web API. Users upload a document from any client system, and the document is passed to the Web API for conversion. Because it is a web-based API, we cannot know which fonts a given user's document will use. Even if we install this font, we might hit the same issue with, say, a French document, and we cannot install every font.

Please let me know the solution for this. I think all fonts should also be copied during conversion. Also, when you tested at your end, which font did you install to get it working properly?

@agrawaltejas

Could you please open the attached HTML at your end and share a screenshot of it?
20.2.zip (128.3 KB)

Please execute the following code example at your end and share the output HTML.

Document doc = new Document(MyDir + "CL CV v188.doc");
HtmlSaveOptions htmlSaveOptions = new HtmlSaveOptions();
htmlSaveOptions.PrettyFormat = true;

doc.Save(MyDir + "20.2.html", htmlSaveOptions);

We tested the scenario on Windows 10 under the English culture and faced no issue at our end. Please share details of your working environment for further testing. Thanks for your cooperation.

Hi Tahir,

It is working well on my Windows 10 machine; however, it is not working on our dev server, which runs Windows Server 2016. Do you have any idea why?

@agrawaltejas

Most probably, this issue is related to missing fonts on Windows Server 2016. Please install Arial and ‘Arial Narrow’ on Windows Server 2016, generate the output document using Aspose.Words, and let us know how it goes on your side. Hope this helps you.
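
To confirm that the server actually sees these fonts, the fonts available to Aspose.Words can be listed and compared with the ones the document needs. This is a rough sketch under assumptions: FontCheck and MissingFonts are illustrative names, the C:\MyFonts\ folder is hypothetical, and the list of required fonts is taken from this thread only. Where installing fonts system-wide is not possible, adding a folder font source as shown here may be an alternative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Aspose.Words.Fonts;

public static class FontCheck
{
    // Returns the required font names that are not in the available set
    // (comparison is case-insensitive).
    public static List<string> MissingFonts(IEnumerable<string> required,
                                            IEnumerable<string> available)
    {
        HashSet<string> known =
            new HashSet<string>(available, StringComparer.OrdinalIgnoreCase);
        return required.Where(name => !known.Contains(name)).ToList();
    }

    public static void Main()
    {
        // Optional: search a folder shipped with the application in addition
        // to the system fonts. The folder path is a hypothetical example.
        FontSettings.DefaultInstance.SetFontsSources(new FontSourceBase[]
        {
            new SystemFontSource(),
            new FolderFontSource(@"C:\MyFonts\", true)
        });

        // Collect the family names of all fonts Aspose.Words can find.
        List<string> available = new List<string>();
        foreach (FontSourceBase source in FontSettings.DefaultInstance.GetFontsSources())
            foreach (PhysicalFontInfo font in source.GetAvailableFonts())
                available.Add(font.FontFamilyName);

        // Fonts named in this thread; adjust the list to your document.
        foreach (string name in MissingFonts(new[] { "Arial", "Arial Narrow" }, available))
            Console.WriteLine("Missing font: " + name);
    }
}
```

Running this on both the Windows 10 machine and the Windows Server 2016 machine should show whether the two environments differ in the fonts they provide.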