Bidi causes reversed English text in Word but not PDF

I’ve attached two documents: one Word and one PDF.

Aspose.Words 13.3.0.0

The issue:
I am using a style to set the bidi properties for text that will be entered into the document. It is not known in advance whether the text will be English or Arabic. When it is Arabic, everything is formatted properly in both Word and PDF. When it is English, the text is reversed in Word but not in PDF. However, if the text is mixed content, it is formatted properly.

I am using the following code, which basically sets the bidi flag for the font and the locale ID to Arabic.

style.Font.Bidi = IsRtl(culture);
style.Font.LocaleIdBi = CultureInfo.GetCultureInfo(culture).LCID;

The style is being applied like so

paragraph.ParagraphFormat.Style = mystyle;
paragraph.ParagraphFormat.Bidi = mystyle.ParagraphFormat.Bidi; 
_builder.MoveTo(paragraph);
_builder.Write(mytext);

Do you know why it’s correct in PDF and mirrored in Word? Is this a bug?

Hi David,

Thanks for your inquiry.

The Bidi property specifies whether the contents of this run shall have right-to-left characteristics.

When this property is on, it shall not be used with strongly left-to-right text; any behavior under that condition is unspecified. When this property is off, it shall not be used with strongly right-to-left text; any behavior under that condition is unspecified.

When the contents of this run are displayed, all characters shall be treated as complex script characters for formatting purposes. This means that BoldBi, ItalicBi, SizeBi and a corresponding font name will be used when rendering this run.

Also, when the contents of this run are displayed, this property acts as a right-to-left override for characters which are classified as “weak types” and “neutral types”.
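To make the distinction between strong characters and weak or neutral ones more concrete, here is a minimal C# sketch that classifies a string before deciding whether Font.Bidi is safe to turn on. The class BidiHint, its Direction enum, and the hard-coded code point ranges are assumptions made for illustration only; they approximate the Unicode bidirectional classes and are not part of Aspose.Words.

```csharp
using System;

// Hypothetical helper, not an Aspose.Words API.
public static class BidiHint
{
    public enum Direction { None, Ltr, Rtl, Mixed }

    // Approximation: a letter in the Hebrew/Arabic-family blocks or their
    // presentation forms counts as strongly right-to-left.
    public static bool IsStrongRtl(char c) =>
        char.IsLetter(c) &&
        ((c >= '\u0590' && c <= '\u08FF') ||
         (c >= '\uFB1D' && c <= '\uFDFF') ||
         (c >= '\uFE70' && c <= '\uFEFC'));

    // Approximation: any other letter counts as strongly left-to-right.
    // Digits, spaces and punctuation are treated as weak or neutral.
    public static bool IsStrongLtr(char c) => char.IsLetter(c) && !IsStrongRtl(c);

    public static Direction Classify(string text)
    {
        bool hasRtl = false, hasLtr = false;
        foreach (char c in text)
        {
            if (IsStrongRtl(c)) hasRtl = true;
            else if (IsStrongLtr(c)) hasLtr = true;
        }
        if (hasRtl && hasLtr) return Direction.Mixed;
        if (hasRtl) return Direction.Rtl;
        return hasLtr ? Direction.Ltr : Direction.None;
    }
}
```

Under these assumptions, Font.Bidi would be turned on only when Classify returns Rtl. A result of Ltr or Mixed means the run contains strongly left-to-right text, which is the case the property is not specified for.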

Please let me know if I can be of any further assistance.

Best regards,

Thanks for the reply. While implementing the bidi properties further, we noticed that this issue only occurs in Word 2013. The documents display fine in Word 2010, but the same document opened in Word 2013 shows reversed English text.

Hi David,

Thanks for the additional information. Yes, you’re right; Microsoft Word 2013 displays the English (LTR: left-to-right) text of your ‘index.doc’ incorrectly. However, when you open this document with OpenOffice, you will find that it renders the English text correctly as well. So, I don’t think there is a problem in the Aspose.Words API. Even so, could you please prepare a small console application that demonstrates this problem and attach it here for testing? I will investigate the problem further and provide you with more information.

Best regards,

I’ve attached a sample console app. The generated output document displays correctly in Word 2010 and incorrectly in Word 2013.

Sorry to add another issue on top, but we also noticed issues with hyphenated words: the words get jumbled. Note that the text displays properly in OpenOffice but improperly in Word 2010.

For example:
input: This-is a test
result: is a test-This

or

input: start this-is a test فاهس هس ش فثسف end
result: end فاهس هس ش فثسف is a test-start this

Hi David,

Thanks for your inquiry.

David:
I’ve attached a sample console app. The generated output document displays correctly in Word 2010 and incorrectly in Word 2013.

After an initial test, this does look like a shortcoming of DocumentBuilder, which should probably adjust the Bidi context automatically when it receives RTL text and split mixed text into several runs with the proper attributes. We will add functionality to DocumentBuilder that enables it to handle mixed LTR and RTL content automatically and transparently, so that you no longer need to set the Font.Bidi property to true/false explicitly. Your request has been linked to the appropriate issue (WORDSNET-7209), and you will be notified as soon as it is supported.
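Until that is available, the idea of breaking mixed text into several runs can be sketched on the caller's side. The C# below is only an illustration under stated assumptions: the class DirectionRuns and its Split method are hypothetical names, the code point ranges are a rough approximation of strongly right-to-left letters, and attaching spaces and punctuation to the preceding segment is a naive choice rather than the Unicode bidirectional algorithm.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not an Aspose.Words API.
public static class DirectionRuns
{
    // Approximation of "strongly right-to-left": letters in the
    // Hebrew/Arabic-family blocks and their presentation forms.
    static bool IsRtlLetter(char c) =>
        char.IsLetter(c) &&
        ((c >= '\u0590' && c <= '\u08FF') ||
         (c >= '\uFB1D' && c <= '\uFDFF') ||
         (c >= '\uFE70' && c <= '\uFEFC'));

    // Splits text into consecutive segments whose letters share one direction.
    // Non-letters (spaces, digits, punctuation) stay with the current segment.
    public static List<(string Text, bool IsRtl)> Split(string text)
    {
        var runs = new List<(string Text, bool IsRtl)>();
        var current = new StringBuilder();
        bool? currentRtl = null; // unknown until the first letter is seen

        foreach (char c in text)
        {
            if (char.IsLetter(c))
            {
                bool rtl = IsRtlLetter(c);
                if (currentRtl.HasValue && currentRtl.Value != rtl)
                {
                    // Direction changed: close the segment in progress.
                    runs.Add((current.ToString(), currentRtl.Value));
                    current.Clear();
                }
                currentRtl = rtl;
            }
            current.Append(c);
        }

        if (current.Length > 0)
            runs.Add((current.ToString(), currentRtl ?? false));
        return runs;
    }
}
```

Each segment could then be written separately, setting _builder.Font.Bidi to the segment's IsRtl value before calling _builder.Write with its Text. Whether this avoids the Word 2013 rendering difference has not been verified in this thread.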

Secondly, when only the ParagraphFormat.Bidi property is set to true, Microsoft Word 2013 displays broken Arabic text, while Microsoft Word 2010 displays the RTL text correctly. To address this problem, I have logged a separate issue in our bug tracking system as WORDSNET-8181. Your thread has also been linked to this issue, and you will be notified as soon as it is resolved.

We apologize for any inconvenience.

Best regards,

Hi David,

Thanks for your inquiry.

*David:
Sorry to add another issue on top, but we also noticed issues with hyphenated words: the words get jumbled. Note that the text displays properly in OpenOffice but improperly in Word 2010.

For example:
input: This-is a test
result: is a test-This

or

input: start this-is a test فاهس هس ش فثسف end
result: end فاهس هس ش فثسف is a test-start this*

Could you please also attach your input/output document(s) and source code here for testing? I will investigate this issue on my side and provide you more information.

Best regards,

Thanks again for looking into our issues. I’ve attached the console app I made before, with additional code to test for the hyphen issue.

We are also testing bidi in PDF and have noticed some issues. Would you prefer that we open another thread for PDF, or should we continue in this one?

Hi David,

Thanks for your inquiry. As mentioned previously, explicitly specifying Font.Bidi is not recommended when inserting mixed content using DocumentBuilder. I think this is again related to WORDSNET-7209; we will inform you as soon as this issue is resolved. We apologize for the inconvenience.

Sure, please create a separate forum thread for each different topic; keeping discussions separate makes the forum easier to manage.

Best regards,

Hi David,

Regarding WORDSNET-8181, I have attached a sample output document (out.doc), produced by the following code snippet, for your reference. When you open this document with Microsoft Word 2013, you will notice that it is unable to form Arabic words; instead, it displays individual characters. However, Microsoft Word 2010/2007 and OpenOffice Writer render the Arabic text in out.doc correctly.

However, if you create a new document in Microsoft Word 2013 and insert the same RTL text as in the code example, you will get exactly the same result as when inserting this text with Aspose.Words. The document looks fine in Microsoft Word 2007/2010, but when viewed in Microsoft Word 2013 the RTL text looks incorrect. I think this is an issue in Microsoft Word 2013 (or its new behavior). Please let us know your opinion on this. Thanks for your cooperation.

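// Build a small test document containing English-only and mixed English/Arabic text.
Paragraph paragraph;
Document _documentNode = new Aspose.Words.Document();
DocumentBuilder _builder = new DocumentBuilder(_documentNode);

// Set the right-to-left flag on the paragraph returned by InsertParagraph.
paragraph = _builder.InsertParagraph();
paragraph.ParagraphFormat.Bidi = true;
// English only.
_builder.Write("test");
// English, Arabic, English.
paragraph = _builder.InsertParagraph();
_builder.Write("Start يبنتشبنشم end");
// Arabic followed by English.
paragraph = _builder.InsertParagraph();
_builder.Write("فاهس هس ش فثسف end");
// English followed by Arabic.
paragraph = _builder.InsertParagraph();
_builder.Write("start فاهس هس ش فثسف");

// Save in the binary DOC format used for the comparison.
_documentNode.Save(@"C:\Temp\out.doc", SaveFormat.Doc);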

Best regards,

Hi David,

In addition, I have attached a comparison screenshot for your reference. We are not experts in the Arabic language, so, considering the Arabic text only, could you please confirm whether both Arabic text outputs shown in the screenshot are acceptable to you? Thanks for your cooperation.

Best regards,