Hi,
We are using Aspose.Words 17.7 for .NET to convert input files of various types to text-based PDF or image files. We have run into an issue with Unicode symbols in text files. We convert the input TXT file to PDF with the following code:
Document wordDocument = new Document(contentStream);
wordDocument.Save(outputStream, SaveFormat.Pdf);
If we run this on a computer with Windows 10 and without Microsoft Office installed, we get an output PDF in which the Unicode symbols are replaced by empty boxes. This may be because the corresponding fonts are not installed on the computer. So we installed Microsoft Office 2013 and repeated the experiment. Now we get a PDF file in which the Georgian text is displayed correctly, but the Ethiopic text and the runes are still replaced by empty boxes. However, if we open the input document in Microsoft Word in the same environment, all Unicode symbols are displayed correctly, and MS Word produces a correct PDF.
So it seems that Aspose.Words does not render all Unicode characters correctly, even when fonts that cover them are available on the system.
The attached ZIP contains:
UTF8.txt - input TXT file;
UTF8ByAspose.pdf - PDF file generated by Aspose.Words without Microsoft Office installed;
UTF8ByAsposeWithOffice.pdf - PDF file generated by Aspose.Words after Microsoft Office was installed;
UTF8ByWord.pdf - PDF file saved by Microsoft Word.
Thanks,
Roman
Unicode.zip (417.6 KB)