DOCX to PDF conversion issue with Arabic text using Java

Hello,


We are using Aspose.Words 11.3 for Java to process and convert Word documents to PDF. It works fine for Latin-script languages, but we are facing some critical issues with Arabic document conversion. The following two are blocking:


1) Tables (in RTL Arabic documents) change their direction in the output PDF regardless of whether the table is set up as RTL or LTR in the original DOCX document; I couldn’t find any workaround for this.

2) The second issue needs your attention, please; it took me quite a while to understand what is happening. In short: whenever a bookmark surrounds Arabic text or an Arabic paragraph (set to RTL), Aspose fails to save the document, but only when the replacement data contains non-alphanumeric characters. This is easier to see than to describe, so I have created some samples (attached) with comments that explain each case. test-one_ar.docx is the document we are trying to process and convert to PDF; test-one.json contains the data that should be placed in the Word document before converting it to PDF. Please note that the JSON file also contains some very useful comments.
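
For anyone trying to reproduce case 2, a small pre-check can help isolate which replacement values are candidates for triggering the failure. The following is only a sketch in plain Java and is not part of Aspose.Words: the class and method names are hypothetical, it assumes the JSON data has already been loaded into a map of bookmark name to replacement text, and it treats everything that is not a Unicode letter or digit (including spaces and Arabic diacritics) as non-alphanumeric.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not an Aspose.Words API.
final class ReplacementDataScan {

    /** True if the value contains at least one code point that is neither a letter nor a digit. */
    static boolean hasNonAlphanumeric(String value) {
        // Assumption: whitespace and punctuation both count as non-alphanumeric.
        return value != null
                && value.codePoints().anyMatch(cp -> !Character.isLetterOrDigit(cp));
    }

    /** Returns, in input order, only the entries whose value contains a non-alphanumeric character. */
    static Map<String, String> suspectValues(Map<String, String> data) {
        Map<String, String> suspects = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : data.entrySet()) {
            if (hasNonAlphanumeric(entry.getValue())) {
                suspects.put(entry.getKey(), entry.getValue());
            }
        }
        return suspects;
    }
}
```

Saving once with only the flagged values and once with only the unflagged ones would show whether the failure really follows the non-alphanumeric data.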


My development machine runs Aspose.Words 11.3 for Java on Windows 7 x64 with MS Office 2010. The call stack shows this:

Caused by: java.lang.NullPointerException
, requestUrl=http://localhost:8080/doc-gen-service/service/templates/test}
com.consol.docgen.service.aspose.DocxProcessingException: Can not save to DocType{contentType='application/pdf', fileExtension='pdf'} stream org.eclipse.jetty.server.AbstractHttpConnection$Output@1661e67c
at com.consol.docgen.service.aspose.AsposeDocxTemplate.save(AsposeDocxTemplate.java:569)
at com.consol.docgen.service.aspose.AsposeDocumentService.generate(AsposeDocumentService.java:34)


I hope to hear from you soon about these issues. Kindly look at the comments in both attachments; it took me a good amount of time to work out what I wrote there, and I’m sure it will be a big help to you.


Many thanks & Regards,
Alaa Tadmori

Hi Alaa Tadmori,


Thanks for your patience. We are working on your query and will get back to you as soon as possible.

Hi Alaa Tadmori,


Thanks for your request. Unfortunately, Aspose.Words does not support RTL during rendering and conversion to PDF. Your request has been linked to the appropriate issue as WORDSNET-6253. We will let you know once it is resolved. Since it is a very complex issue, I cannot promise you a fast fix.

In your second case, could you please share your code where you are trying to replace data with non-alphanumeric characters? We will take a closer look and guide you accordingly.

Hi Alaa,

Regarding issue WORDSNET-6358, we have received a response from our development team, who have completed the analysis of this issue. WORDSNET-6358 has the same root cause as issue number 2 described in your first post. We will update you via this forum thread once these issues are resolved.

We really appreciate your patience.

Hello & Thanks for the update.


I am happy that these issues are worked on & hope to see fixes for them in the near future.
We actually kept trying to find workarounds for these issues, but in vain, as our whole application crashes when trying to process a DOCX document as described in my earlier posts.

I wonder if there is a way we could track the work that is being done on this issue (and others)? I also wonder about the issue name “WORDSNET”, as our product is Aspose.Words for Java, not Aspose.Words for .NET!

Looking forward to hearing from you soon.

Thanks & Regards,
Alaa Tadmori

Hi Alaa,

Please accept my apologies for the late response. The shared issue exists in both Aspose.Words for .NET and Aspose.Words for Java. That is why the issue name starts with “WORDSNET”.

I have verified the status of the shared issue in our issue tracking system, and it is currently in the development phase. You will be updated via this forum thread once this issue is resolved.

We really appreciate your patience and apologize for the inconvenience.

The issues you have found earlier (filed as WORDSNET-6358) have been fixed in this .NET update and this Java update.


This message was posted using Notification2Forum from Downloads module by aspose.notifier.

Thanks for your effort. The previously reported bug is indeed fixed in this release.


I have another bug for you though : )
Unfortunately, our Arabic documents still fail to convert to PDF. So far this bug seems easier to understand, and hopefully to solve, than the previous one. Here is a description of it:

First the stack trace shows this:

Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -2
at java.lang.String.substring(String.java:1937)
at com.aspose.words.alx.e(Unknown Source)
at com.aspose.words.aly.aqZ(Unknown Source)
at com.aspose.words.j.AT(Unknown Source)
at com.aspose.words.j.AS(Unknown Source)
at com.aspose.words.fv.MJ(Unknown Source)
at com.aspose.words.fv.moveNext(Unknown Source)
at com.aspose.words.fv.MC(Unknown Source)
at com.aspose.words.xo.a(Unknown Source)
at com.aspose.words.Document.updatePageLayout(Unknown Source)
at com.aspose.words.Document.aV(Unknown Source)
at com.aspose.words.Document.Lo(Unknown Source)
at com.aspose.words.Document.getPageCount(Unknown Source)
at com.aspose.words.qt.a(Unknown Source)
at com.aspose.words.Document.a(Unknown Source)
at com.aspose.words.Document.a(Unknown Source)
at com.aspose.words.Document.i(Unknown Source)
at com.aspose.words.Document.save(Unknown Source)
at com.consol.docgen.service.aspose.AsposeDocxTemplate.save(AsposeDocxTemplate.java:567)
… 46 more

Because Aspose reports only a generic error message when a document fails, I tried to diagnose the document to see what exactly is failing. To cut a long story short, certain styles or formatting on certain words cause the error. I attached a Word document containing only one word that fails to convert to PDF. If you copy this text into Notepad (to remove styles and formatting) and then paste it back into Word, everything works fine.
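
Since the wrapping exception only carries a generic message, logging the innermost cause is usually the quickest way to see what actually failed during save. A minimal sketch in plain Java, with hypothetical class and method names and no Aspose-specific calls, that walks the cause chain down to the root exception:

```java
// Hypothetical diagnostic helper, not an Aspose.Words API.
final class RootCause {

    /** Follows getCause() until there is no further cause and returns the innermost throwable. */
    static Throwable of(Throwable error) {
        Throwable current = error;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    /** Formats the innermost throwable as "fully.qualified.ClassName: message" for logging. */
    static String describe(Throwable error) {
        Throwable root = of(error);
        return root.getClass().getName() + ": " + root.getMessage();
    }
}
```

Calling `RootCause.describe(e)` in the catch block around the save call would log the innermost exception type and message instead of only the generic wrapper.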

We are using Aspose.Words 11.4 for Java; my development machine is Windows 7 x64 with Office 2010.

Looking forward to hearing from you soon about this.
Thanks again for your support.

Alaa Tadmori,
Software Developer
ConSol* MENA LTD.

Hi Alaa,

We are really sorry for the inconvenience.

I have managed to reproduce the same issue at my end. I have logged this issue in our issue tracking system and you will be notified via this forum thread once this issue is resolved.

We really appreciate your patience and apologize for the inconvenience.

Hello Tahir & Thanks for your care.

I have found another issue that causes Arabic documents to fail to convert to PDF: any Arabic text inside a shape causes the whole document to fail to convert to PDF (regardless of whether the text is formatted or not).
The call stack shows this:

Caused by: java.lang.NegativeArraySizeException
at com.aspose.words.aly.<init>(Unknown Source)
at com.aspose.words.j.p(Unknown Source)
at com.aspose.words.j.AS(Unknown Source)
at com.aspose.words.fv.MJ(Unknown Source)
at com.aspose.words.fv.moveNext(Unknown Source)
at com.aspose.words.fv.MC(Unknown Source)
at com.aspose.words.xo.a(Unknown Source)
at com.aspose.words.Document.updatePageLayout(Unknown Source)
at com.aspose.words.Document.aV(Unknown Source)
at com.aspose.words.Document.Lo(Unknown Source)
at com.aspose.words.Document.getPageCount(Unknown Source)
at com.aspose.words.qt.a(Unknown Source)
at com.aspose.words.Document.a(Unknown Source)
at com.aspose.words.Document.a(Unknown Source)
at com.aspose.words.Document.i(Unknown Source)
at com.aspose.words.Document.save(Unknown Source)
at com.consol.docgen.service.aspose.AsposeDocxTemplate.save(AsposeDocxTemplate.java:567)
... 46 more
A sample document is attached as well. System info is the same as in my previous posts.

Hope to hear from you soon about this issue as well.

Thanks & Regards,
Alaa Tadmori, Software Developer
ConSol MENA LTD

Hi Alaa,

Thanks for your query. I have managed to reproduce the same issue at my end. I have logged this issue in our issue tracking system. You will be notified via this forum thread once this issue is resolved.

We apologize for your inconvenience.

The issues you have found earlier (filed as WORDSNET-6253) have been fixed in this .NET update and this Java update.



The issues you have found earlier (filed as WORDSNET-6476) have been fixed in this .NET update and this Java update.



Hi Tahir,


Thanks for the steady progress you are making on our reported issues. One of them is indeed solved, and I verified that on my end, but the other one is not (the issue from the post of 06-11-2012, 5:31 AM: some formatted text fails to convert to PDF). If you try the file attached to that post, you will find that the DOCX file still fails to convert to PDF.

We were excited to download the July release, hoping to get these bugs fixed. Should we wait until August for this fix, or could it be provided as a hotfix?

I appreciate your cooperation. Many thanks!

Alaa Tadmori,
Software Developer
ConSol* MENA LTD.

Hi Alaa,

I have successfully converted “shape-test.docx” to PDF using the latest version of Aspose.Words for Java.

However, the issue related to the document “styles-causing-errors-sample.docx” has not been resolved yet. This issue has been logged with ID WORDSNET-6465. The analysis of this issue has been completed, but we cannot provide an ETA at the moment. You will be updated via this forum thread once it is resolved.

We really appreciate your patience.

Dear Tahir,

Thanks for the clarification. I thought that "styles-causing-errors-sample.docx" was solved because, when you logged it in your system, you didn't give me its reference number as you usually do. So when you told me later that "WORDSNET-6476" & "WORDSNET-6253" were fixed, I assumed those were "shape-test.docx" & "styles-causing-errors-sample.docx", as these were the last two issues we were discussing.

However, it is all clear now. I confirm that "shape-test.docx" is solved and "WORDSNET-6253" is also solved. That leaves the issue with "styles-causing-errors-sample.docx", which we are hoping to get a fix for in the coming days. I am looking forward to hearing from you soon about it.

Thanks & Best regards,
Alaa Tadmori,
Software Developer
ConSol* MENA LTD

Hi Alaa,

Thanks for your feedback. We will inform you via this forum thread once this issue (WORDSNET-6465) is resolved.

We really appreciate your patience.

The issues you have found earlier (filed as WORDSNET-6465) have been fixed in this .NET update and this Java update.


I confirm that the issue (WORDSNET-6465) is now resolved with this version (Aspose 11.6).

Many thanks & Regards,
Alaa Tadmori,
Software Developer
ConSol* MENA LTD

Hi Alaa,


Thanks for your feedback. It is great that the new version of Aspose.Words resolved the problem. If we can help you with anything else, please feel free to ask.

Best Regards,