Preserve Text Direction LTR or RTL of Hebrew English Mixed Content & Convert Save As DOCX to PDF using C# .NET - OpenXml

Hi

I have the EXACT same problem as the one reported in the forum in July 2016.

Your answer then was that it wasn’t solved.

Please let me know what the solution is.

I am copying parts of the issue as stated in July 2016:

In our .NET application, we create a Word document using OpenXML and then use Aspose.Words to convert the document byte array to PDF, but in the output PDF the tables are corrupted.

Attached is an example of the output PDF (1.pdf).

If we save the source Word document to the file system, then open it in MS Word and then convert it to PDF, the output PDF is OK (attached 2.pdf), but this is not possible in our case.

It’s important that you don’t open the original Word document with MS Word, because then it fixes something and the problem does not occur.

Thank you

David

@davidaspose1,

Please ZIP and attach the following resources here for testing:

  • Your simplified input Word document
  • Aspose.Words for .NET 20.1 generated output PDF file showing the undesired behavior
  • Your expected PDF file showing the desired output. You can create this document by using MS Word and attach it here for our reference

As soon as you get these pieces of information ready, we will start investigation into your scenario and provide you more information. Thanks for your cooperation.

Hi Ariela

I have a feeling I have the same problem as you - a Word document created with OpenXml, and then the table comes out backwards in Hebrew in the PDF.
Have you found a solution?
If so, please let me know.
Thank you

David

dlerner@mof.gov.il

example openxml - pdf.zip (93.6 KB)
example openxml - word.zip (68.0 KB)

Hi

I’ve uploaded the Word file and the PDF files.
The Word file was created via OpenXML.
The PDF was produced by saving the byte array via Aspose.

The problem is with text that contains both Hebrew and English.
Pay attention to the line which contains “IBoxx $ Liquid”.
In the Word document it appears on the left - which is correct -
while in the PDF it appears incorrectly on the right.
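
As background for why a line like this is sensitive to direction settings, here is a minimal Python sketch that prints the Unicode bidirectional class of each character in a mixed line. The Hebrew word in the sample is a hypothetical stand-in, not text from the attached document, and the sketch says nothing about which renderer is right; it only shows that the line mixes strong left-to-right letters, strong right-to-left letters, and weaker characters such as `$` and spaces whose placement depends on the surrounding direction.

```python
import unicodedata


def bidi_classes(text):
    """Return (character, Unicode bidi class) pairs for each character in text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


# "שלום" is a hypothetical Hebrew word used only for illustration.
sample = "IBoxx $ Liquid שלום"

for ch, cls in bidi_classes(sample):
    # L = strong left-to-right, R = strong right-to-left,
    # ET = European terminator (e.g. "$"), WS = whitespace
    print(repr(ch), cls)
```

Latin letters report `L`, Hebrew letters report `R`, while `$` and the spaces report weak or neutral classes, so where the English phrase lands on the line depends on the base direction the renderer applies to the paragraph or table cell.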

If we save the source Word document to the file system, then open it in MS Word and then convert it to PDF, the output PDF is OK, but this is not possible in our case.

It’s important that you don’t open and save the original Word document with MS Word, because then it fixes something and the problem does not occur.

Thanks

David

P.S. This exact problem appeared in your forum in July 2016 - sent by Dvir Peretz - but you did not supply a solution…

@davidaspose1,

Please also ZIP and attach the following resources here for further testing:

  • ‘FrankRuehl’ and ‘3 of 9 Barcode’ font files
  • Convert your Document to PDF format by using MS Word on your end and attach the MS Word generated PDF file here for our reference
  • A comparison screenshot highlighting the problematic areas in this 20.1.pdf (91.3 KB) w.r.t MS Word generated PDF file

Thanks for your cooperation.

Hi

I’m attaching the following:

FrankRuehl font: frankruehl.zip (22.4 KB)
David font: david.zip (59.2 KB)
MS Word generated PDF (using “Save As” PDF): example msword.pdf (442.3 KB)

Comparison screenshot: pdfs.PNG (113.5 KB)

Please Read Carefully:
In my opinion the problem is not related to fonts.

If I create my Word document from OpenXml and then use the same byte array to save as PDF (with Aspose), then the results are not good.

If I create my Word document from OpenXml and then open it in MS Word and save it (Save As), and then save that as PDF (with Aspose), then the results are good.

What I’ve found is that the Word document created from OpenXml is DIFFERENT from the one produced by opening that same document in MS Word and saving it (Save As).

Therefore the problem lies in the Word document created by OpenXML, which is different from one created ‘normally’, and Aspose has issues with it.

Try this - open the Word document I attached, which was created with OpenXML, and do “Save As” to a different name - you’ll see that THE FILE SIZES ARE DIFFERENT, meaning that they are saved differently - and apparently ASPOSE can handle the ‘regular’ Word document but not the one created by OPENXML.
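
File size alone does not show what changed. A minimal Python sketch of a more precise check, assuming both files are ordinary DOCX (ZIP) packages: it compares the parts inside the two packages and reports which parts exist in only one file and which differ in content. The function names and the file names in the usage comment are hypothetical.

```python
import zipfile


def part_crcs(docx):
    """Map each part name in a DOCX (ZIP) package to the CRC-32 of its content."""
    with zipfile.ZipFile(docx) as package:
        return {info.filename: info.CRC for info in package.infolist()}


def diff_packages(docx_a, docx_b):
    """Report parts present in only one package and parts whose content differs."""
    a, b = part_crcs(docx_a), part_crcs(docx_b)
    return {
        "only_in_a": sorted(set(a) - set(b)),
        "only_in_b": sorted(set(b) - set(a)),
        "changed": sorted(n for n in set(a) & set(b) if a[n] != b[n]),
    }


# Hypothetical usage with the two files described in this post:
#   report = diff_packages("openxml.docx", "saveas.docx")
#   for kind, names in report.items():
#       print(kind, names)
```

If `word/document.xml`, `word/styles.xml` or `word/settings.xml` shows up under `changed`, those are the parts worth diffing by hand to see what MS Word rewrote on Save As.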

Thanks

David

@davidaspose1,

Thanks for the details. We are working on your query and will get back to you soon.

@davidaspose1,

We have logged this problem in our issue tracking system. Your ticket number is WORDSNET-19915. We will further look into the details of this problem and will keep you updated on the status of the linked issue. We apologize for your inconvenience.

@davidaspose1,

Regarding WORDSNET-19915, we need to further understand how you produced “example msword.pdf” on your end. Did you just open the “example openxml - word.docx” document with MS Word 2016 and then use the “Save As” command to save to PDF?

Also, please open “example openxml - word.docx” with MS Word and then “Save As” to XPS format. Please ZIP this XPS file and attach it here for further testing.

Also, please provide a screenshot of MS Word with “example openxml - word.docx” opened in it.

Thanks for your cooperation.

Hi

The following are EXACTLY the steps performed:

  1. An MS Word document - .docx - was created using OPENXML - attaching OpenXml.zip (70.7 KB)

  2. In my C# application I then opened the OPENXML-generated Word document and saved it as a PDF using Aspose - attaching OpenXmlPDF.zip (94.5 KB)
    The results WERE NOT GOOD - pay attention to the table in the body of the documents, where the WORD document has the words IBoxx $ Liquid on the left-hand side of the line while the PDF document shows the words IBoxx $ Liquid on the right.

HOWEVER

  1. I take the EXACT same Word document I created via OPENXML, open it in Microsoft Word 2016 and then do SAVEAS - attaching SaveAs.zip (65.0 KB)

  2. In my C# application I then repeat the exact same procedure - open the Word file which was created by SAVEAS and save it as a PDF using Aspose. The RESULTS ARE AS DESIRED - the words “IBoxx $ Liquid” appear on the left side of both the WORD and PDF documents - attached SaveAsPDF.zip (87.0 KB)

I am also attaching, as you requested, the Word file created by OPENXML saved as an XPS file - attached - OpenXmlXPS.zip (580.1 KB)

The problem appears to be a difference between how ASPOSE converts a Word document created by OPENXML to PDF and how it converts the same document after it has been saved in MS Word (created by doing SAVEAS on the OpenXML Word document).
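
One way to narrow this difference down is to count the direction-related markup in each file. The following is a minimal Python sketch, assuming the main part is `word/document.xml` and uses the transitional WordprocessingML namespace. It counts `w:bidi` (right-to-left paragraph), `w:rtl` (right-to-left run) and `w:bidiVisual` (visually right-to-left table) elements. Whether any of these explains the behaviour here is an assumption to verify, not an established cause; the function name and the file names in the usage comment are hypothetical.

```python
import xml.etree.ElementTree as ET
import zipfile

# Transitional WordprocessingML namespace (assumed for both files).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
DIRECTION_ELEMENTS = ("bidi", "rtl", "bidiVisual")


def count_direction_markup(docx):
    """Count direction-related elements in word/document.xml of a DOCX package."""
    with zipfile.ZipFile(docx) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    return {
        name: sum(1 for _ in root.iter("{%s}%s" % (W_NS, name)))
        for name in DIRECTION_ELEMENTS
    }


# Hypothetical usage with the two files from the steps above:
#   print(count_direction_markup("openxml.docx"))
#   print(count_direction_markup("saveas.docx"))
```

If the counts differ between the two files, the elements that MS Word added or removed on SAVEAS are candidates for what the OpenXML generation code would need to emit itself.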

Thank You

David

@davidaspose1,

Thanks for the additional information. We have logged these details in our issue tracking system and will keep you posted on further updates.

@davidaspose1,

Regarding WORDSNET-19915, here is an update: in MS Word 2013 we can observe the word order in the mixed Hebrew/English lines as you described it. The word order for these lines is different in MS Word 2016 and MS Word 2019. The latest Aspose.Words version orders the words in those lines the same way as MS Word 2016 and MS Word 2019 do.

Attached are PDF files created from “example openxml - word.docx” by MS Word 2013, MS Word 2016 and MS Word 2019, screenshots of the document opened in each MS Word version, and screenshots of each version’s “About” dialog box.

Please also provide the following resources here for further testing:

  1. A screenshot of “example openxml - word.docx” opened in your MS Word
  2. A screenshot of your MS Word’s “About” dialog box

Thanks for your cooperation.