Arabic script direction is reversed after conversion from DOCX to HTML format

I converted a *.docx document to *.html format with Aspose.Words, and the resulting document has a few inconsistencies.
The most affected pages:

  • 3: The Arabic script near ‘Balochi’ has changed its direction to the opposite.

I attached the document and a screenshot with a comparison of the script in MS Word and the browser (Firefox).

Code sample:

var doc = new Document(inPath);

var optHtml = new HtmlFixedSaveOptions()
{
	PageSet = new PageSet(new PageRange(0, int.MaxValue)),
	ExportEmbeddedCss = true,
	ExportEmbeddedFonts = true,
	ExportEmbeddedImages = true,
	ExportEmbeddedSvg = true
};
doc.Save(inPath + ".html", optHtml);

I use Aspose.Words.dll 25.3.0.0 and Microsoft® Word for Microsoft 365 MSO (Version 2501 Build 16.0.18429.20132) 64-bit.
test38.zip (74.8 KB)

@licenses

The change in Arabic script direction after converting a DOCX document to HTML is likely caused by text direction settings that are not properly configured in your original document.

To resolve this, you should ensure that the text direction for the paragraphs containing Arabic text is set to “Right-to-Left” in your DOCX file before performing the conversion. This can be done in Microsoft Word by selecting the relevant paragraphs and setting their text direction accordingly.

Here’s a brief outline of the steps you can take:

  1. Open your DOCX document in Microsoft Word.
  2. Select the paragraphs that contain Arabic text.
  3. Set the text direction to Right-to-Left:
    • Go to the “Home” tab.
    • In the “Paragraph” group, click on the small arrow in the bottom right corner to open the Paragraph dialog.
    • Under the “General” section, find the “Text direction” dropdown and select “Right-to-Left”.
  4. Save the document.

After ensuring that the text direction is set correctly, you can use your existing code to convert the document to HTML:

var doc = new Document(inPath);

var optHtml = new HtmlFixedSaveOptions()
{
    PageSet = new PageSet(new PageRange(0, int.MaxValue)),
    ExportEmbeddedCss = true,
    ExportEmbeddedFonts = true,
    ExportEmbeddedImages = true,
    ExportEmbeddedSvg = true
};
doc.Save(inPath + ".html", optHtml);

This should help maintain the correct direction of the Arabic script in the resulting HTML file. If you continue to experience issues, please double-check the text direction settings in your original document or consider reaching out for further assistance.

If you need more detailed guidance on handling text direction in Aspose.Words, you can refer to the documentation on document text direction settings.

Feel free to ask if you have any more questions!

@licenses This looks like expected behavior. If you convert your document to HTML using MS Word, you will see the same result.
It appears to occur because the text direction is set improperly in the source document: the Arabic script in your document has LTR direction instead of RTL.
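
One way to check the direction flags from code is to walk the runs of the document. The following is only a minimal sketch for diagnosis, not something taken from the attached file: it prints every run that contains characters from the basic Arabic Unicode block (U+0600 to U+06FF) together with its run-level and paragraph-level bidi flags. The helper `ContainsArabic` is hypothetical, and the character range is an assumption that may need widening for extended Arabic letters.

```csharp
using System;
using System.Linq;
using Aspose.Words;

class BidiCheck
{
    // Hypothetical helper: true if the text has at least one character
    // in the basic Arabic block. Extended blocks are not covered here.
    static bool ContainsArabic(string text) =>
        text.Any(c => c >= '\u0600' && c <= '\u06FF');

    static void Main(string[] args)
    {
        string inPath = args[0]; // path to the source DOCX
        var doc = new Document(inPath);

        // Visit every run in the document, including nested ones.
        foreach (Run run in doc.GetChildNodes(NodeType.Run, true))
        {
            if (!ContainsArabic(run.Text))
                continue;

            // Font.Bidi is the run-level right-to-left flag;
            // ParagraphFormat.Bidi is the paragraph-level one.
            Paragraph para = run.ParentParagraph;
            bool paraBidi = para != null && para.ParagraphFormat.Bidi;

            Console.WriteLine(
                $"run bidi={run.Font.Bidi}, paragraph bidi={paraBidi}: {run.Text}");
        }
    }
}
```

If the Arabic runs are reported with `bidi=False`, that would be consistent with the LTR direction described above.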

Yes, you are right about MS Word, though the mentioned settings don’t change anything for me.
The script is in the original direction when exported to *.pdf but in the opposite one for *.html and printing.

Thanks for your help. I guess I will need to investigate MS Word behavior first.
