Thanks for your inquiry. I am afraid I could not see any issue with your input/output documents. Could you please clarify where the issue is? Please see the attached screenshot: your HTML, Word and PDF documents all look the same.
Could you please confirm that you opened the attached docx in your MS Word and that everything appears OK (unlike the screenshot attached above)?
The screenshot shows the docx file (generated by Aspose.Words) opened in MS Word 2010. You can tell by the [Compatibility Mode] label in the title bar, though I have no idea why Word shows that. The file was generated by Aspose.Words and left untouched.
If I misunderstood you, please clarify how I should open the file in the correct encoding mode so that all characters are shown correctly.
My bad. I’ve just noticed that you opened the docx in Word 2013. However, as my screenshot shows, it doesn’t appear correctly in Word 2010. Could you please describe the solution in more detail?
Thanks for your inquiry. It seems to be a problem with the DOCX output generated by Aspose.Words. I have logged this issue in our bug tracking system. The ID of this issue is WORDSNET-10430. Your thread has also been linked to this issue, and you will be notified as soon as it is resolved. Sorry for the inconvenience.
Secondly, yes, MS Word 2010 does not display the first two characters correctly, but MS Word 2013 has no such problem. I think that, as a workaround, you may first save your document to RTF format and then re-save it to DOCX format using Aspose.Words. MS Word 2010 should then open the final DOCX correctly.
I can confirm that the issue doesn’t appear in the RTF output. However, we cannot use the proposed workaround, as there are some manipulations we perform before outputting to docx.
I also noticed that you raised issue WORDSNET-10430, which is for .NET rather than Java, where we have the problem. Please verify.
Thanks for your inquiry. Please note that the latest version of Aspose.Words for Java is completely auto-ported from .NET, i.e. we do not write code for Aspose.Words for Java; it is generated automatically from the C# code of Aspose.Words for .NET. In your case, the issue, which was logged with the WORDSNET prefix, will also be resolved automatically in the Java variant of Aspose.Words. Your problem will be fixed as soon as the linked issue is resolved.
Hi Awais,
Even though the raised bug is not yet fixed, we have hit the same issue with other special characters, and we now see it in PDF output as well as in Word generation. Please have a look at the attached documents.
Also, the proposed solution of saving to RTF doesn’t work in this case.
Thanks for your inquiry. The problem can be observed even when you view worddocument.docx with Microsoft Word 2013 (please see the attached screenshot). However, you may use the following code to work around this issue:
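Document doc = new Document(MyDir + "worddocument.docx");
// Round-trip the document through RTF in memory.
MemoryStream rtfStream = new MemoryStream();
doc.Save(rtfStream, SaveFormat.Rtf);
// Rewind the stream so the RTF is read from the start.
rtfStream.Position = 0;
Document docx = new Document(rtfStream);
// Save the reloaded document as PDF.
docx.Save(MyDir + "out.pdf");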