Write Unicode Text of any Language (Punjabi Gujarati) & UTF-8 Characters in DOCX & Convert to PDF using Java API

@srinivasc,

We have logged your requirement in our issue tracking system. Your ticket number is WORDSNET-17590. We will further look into the details of this requirement and will keep you updated on the status of the linked issue.

@srinivasc,

Regarding WORDSNET-17590, our understanding is that you take some Unicode text (which may be in any language), insert it into the Aspose.Words DOM, and then save it to DOCX/PDF. If you insert all text with a single default font, this case is handled by the font fallback mechanism. The text is stored in the DOM and saved to DOCX (and other flow formats) as is; font fallback is then performed by the application that opens the file (MS Word or another). When rendering to PDF (and other fixed-page formats), Aspose.Words performs font fallback itself. So, generally, you should not need to take any additional action.

You also mentioned that the generated DOCX file does not open properly. We assume it is being opened in an application other than MS Word, because MS Word 2016 handles all your documents well. If you cannot rely on the application that opens the DOCX, an alternative (as suggested here) is to use a third-party library to detect the language of the Unicode text and set the font in the DOM accordingly. As another alternative, we could consider introducing a new feature that changes the fonts in the DOM according to our font fallback rules.
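To illustrate the first alternative, here is a minimal sketch only. It does not use a third-party library; instead it relies on the JDK's built-in `Character.UnicodeScript`, which detects the script (Gurmukhi, Gujarati, and so on) rather than the language. The class name `ScriptFontPicker`, its methods, and the script-to-font table are hypothetical, and the font names are assumptions based on fonts commonly installed with Windows, so please adjust them to the fonts available on your system.

```java
import java.lang.Character.UnicodeScript;
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: splits text into same-script runs and suggests a font per run. */
final class ScriptFontPicker {

    /** A run of consecutive characters that belong to one Unicode script. */
    static final class Run {
        final String text;
        final UnicodeScript script;

        Run(String text, UnicodeScript script) {
            this.text = text;
            this.script = script;
        }
    }

    // Assumed mapping (fonts commonly shipped with Windows); not taken from Aspose.Words.
    private static final Map<UnicodeScript, String> FONTS = new EnumMap<>(UnicodeScript.class);
    static {
        FONTS.put(UnicodeScript.GURMUKHI, "Raavi");    // Punjabi
        FONTS.put(UnicodeScript.GUJARATI, "Shruti");   // Gujarati
        FONTS.put(UnicodeScript.DEVANAGARI, "Mangal"); // Hindi, Marathi
    }

    /** Returns the mapped font for a script, or defaultFont when the script is not in the table. */
    static String fontFor(UnicodeScript script, String defaultFont) {
        return FONTS.getOrDefault(script, defaultFont);
    }

    /** Splits text into runs; a new run starts whenever the script changes. */
    static List<Run> split(String text) {
        List<Run> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        UnicodeScript currentScript = null;

        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            UnicodeScript script = UnicodeScript.of(cp);
            // COMMON (spaces, digits, punctuation) and INHERITED (combining marks)
            // never start a new run; they stay attached to the run in progress.
            boolean neutral = script == UnicodeScript.COMMON || script == UnicodeScript.INHERITED;
            if (!neutral) {
                if (currentScript != null && script != currentScript) {
                    runs.add(new Run(current.toString(), currentScript));
                    current.setLength(0);
                }
                currentScript = script;
            }
            current.appendCodePoint(cp);
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            // Text made only of neutral characters is reported as COMMON.
            runs.add(new Run(current.toString(),
                    currentScript == null ? UnicodeScript.COMMON : currentScript));
        }
        return runs;
    }
}
```

Each run returned by `ScriptFontPicker.split(...)` could then be written to the document with the font returned by `fontFor(run.script, "Arial")`, for example by setting the font name on the `DocumentBuilder` before writing the run's text. Script detection cannot distinguish languages that share a script, so a dedicated language-detection library may still be needed if that distinction matters.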

As for the specific issue with saving to PDF, we have updated our default fallback settings to match MS Word's behavior. All text is now rendered correctly except the text for “Segoe UI Symbol”, which MS Word 2016 does not render either.

We will keep you posted on further updates and let you know when this issue is resolved.

The issues you found earlier (filed as WORDSNET-17590) have been fixed in this Aspose.Words for .NET 18.12 update and this Aspose.Words for Java 18.12 update.