Docx to pdf with exotic characters

Hi,

We are using Aspose.Words (latest version, aspose-words-16.5.0-jdk16.jar) to convert a DOCX to PDF, and exotic Unicode characters are not converted correctly. We are using the standard PdfSaveOptions and have tried several options without success.

I have attached the docx and converted pdf files.

Thanks for your help
Heiko Heimann

Hi Heiko,

Thanks for your inquiry. Please note that Aspose.Words mimics the behavior of MS Word. If you convert your document to PDF using MS Word, you will get the same output. MS Word does not show the Unicode characters correctly for your document. Please check the attached image for details.

Hi Tahir,

Thanks for your reply. In fact, my installation of MS Word shows the Unicode characters correctly (see attachment) and also exports them correctly to PDF (see attachment; this PDF file was exported using Word). I am using MS Office Word 2007.

Regards
Heiko Heimann

Hi Heiko,

Thanks for sharing the details. MS Word 2007 shows the German characters correctly for your document. However, MS Word 2013 does not. We have logged this problem in our issue tracking system as WORDSNET-13849. You will be notified via this forum thread once this issue is resolved.

We apologize for the inconvenience.

Hi Heiko,

Thanks for your patience. Our product team has completed its work on the issue (WORDSNET-13849) and concluded that the undesired behavior you are observing is not a bug in Aspose.Words. So, we have closed this issue as 'Not a Bug'.

The problem is related to the font fallback mechanism. The missing characters are from the Ethiopic Unicode range, which is not included in the Aspose.Words fallback lists. It also seems not to be included in the MS Word 2013 and 2016 fallback lists. Aspose.Words currently relies on the latest MS Word behavior in its font fallback implementation. Please change the font of the problematic text in the input document to a font that contains these characters to get the desired output.
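
To find which text needs a different font, it can help to locate the Ethiopic characters first. The following is only a minimal sketch using the standard Java library (Java 7 or later), not part of the Aspose.Words API. The class name `EthiopicScan` and its methods are hypothetical names chosen for illustration, and it assumes you have already extracted the document text as a `String`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not an Aspose.Words class.
final class EthiopicScan {

    // True if the code point belongs to the Ethiopic script
    // (the base Ethiopic block and its supplement/extended blocks).
    static boolean isEthiopic(int codePoint) {
        return Character.UnicodeScript.of(codePoint) == Character.UnicodeScript.ETHIOPIC;
    }

    // True if the text contains at least one Ethiopic code point.
    static boolean containsEthiopic(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (isEthiopic(cp)) {
                return true;
            }
            i += Character.charCount(cp);
        }
        return false;
    }

    // Returns each maximal run of consecutive Ethiopic code points,
    // in the order they appear in the text.
    static List<String> ethiopicSegments(String text) {
        List<String> segments = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (isEthiopic(cp)) {
                current.appendCodePoint(cp);
            } else if (current.length() > 0) {
                segments.add(current.toString());
                current.setLength(0);
            }
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            segments.add(current.toString());
        }
        return segments;
    }
}
```

The segments returned by `ethiopicSegments` are the pieces of text whose font would need to be changed to one that covers the Ethiopic range. Which fonts qualify depends on what is installed on the machine doing the conversion.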