Free Support Forum - aspose.com

Accented characters stripped

I downloaded a demo of Aspose.Words (the latest version). Using it to convert DOC to HTML strips the accented characters from the resulting HTML.

Any ideas?



Hi

Thanks for your inquiry. Could you please provide more information about your problem and attach a sample document so I can investigate?

Best regards.

Yes. I have a web application (a CMS). I allow users to upload Word docs (over HTTP) and convert them to HTML on upload so they can be referenced cleanly in iframes, etc. Accented characters are stripped by Aspose.Words.Document.Save.

I have the same problem on the same machine with Word automation (using SaveAs, web page filtered). If, however, I use Word interactively on the same machine, I don't have the problem. Nor do I have the problem with any of the methods I've explored (Aspose.Words or Word automation) on other machines. In other words, the problem is almost certainly environmental, but what should I be looking for? I need help.

I'd like to get Aspose for its stability over Word automation, but that is something only I care about, because I did a reasonably good job coding my CMS to use Word automation a long while ago. Getting funding to purchase Aspose rests heavily on solving the accented-character stripping problem, which I thought was certainly a peculiarity of Word and would be solved with Aspose. Not so.

Attached is a Word document containing the word "Test" with the "e" accented. It is the file I have been testing the Aspose.Words.Document.Save method with.

Thank you for your help!

Hi

Thanks for the additional information. I can't reproduce this issue on my side. Please see the attached HTML (output).

Best regards.

Yes. That doesn't come as a surprise to me. As I mentioned, I have environments here where I can get it to work too. The problem is that it fails in my production environment.

This is not a report of an Aspose bug; rather, it is a request for pre-sales support. No one here is going to find funding for something that doesn't work out of the gate (even if that is my problem). I am asking for help.

Have you ever heard of this before? If so, do you know how it was resolved?

Hi

Maybe this is an encoding issue. Aspose.Words saves HTML using UTF-8 encoding by default. Maybe this article will be useful for you.

http://aspnetresources.com/blog/unicode_in_vsnet.aspx

Best regards.
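To tell an encoding mismatch apart from characters that were genuinely dropped, it can help to inspect the raw bytes of the saved HTML. The following is only a minimal Python sketch, not part of Aspose.Words: the function name `diagnose_accents` and its result labels are hypothetical, and it assumes the output is expected to be UTF-8.

```python
import html
import unicodedata


def diagnose_accents(html_bytes: bytes, expected: str) -> str:
    """Classify what happened to an expected accented string in saved HTML.

    Hypothetical helper for illustration. Returns one of:
    "preserved", "stripped", "not-utf8", "missing".
    """
    try:
        text = html_bytes.decode("utf-8")
    except UnicodeDecodeError:
        # The bytes are not valid UTF-8, so the file was probably written
        # in another encoding (for example a legacy Windows code page).
        return "not-utf8"

    # Resolve character references such as &eacute; and normalize both
    # sides so composed and decomposed forms compare equal.
    text = unicodedata.normalize("NFC", html.unescape(text))
    expected = unicodedata.normalize("NFC", expected)
    if expected in text:
        return "preserved"

    # Build the accent-free variant: decompose, then drop combining marks.
    decomposed = unicodedata.normalize("NFD", expected)
    plain = "".join(c for c in decomposed if not unicodedata.combining(c))
    if plain in text:
        return "stripped"
    return "missing"
```

A result of "not-utf8" points to an encoding mismatch between writer and reader, while "stripped" means the accent was lost before or during the save.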

Thanks for that, Alexey. The web.config:globalization:fileEncoding is already set to UTF-8, so that will not resolve it, but I bet it is going to be something like that. I will keep looking. If you see anything else, I would appreciate it if you passed it along.

Thanks for your help Alexey.

This is now resolved. I am embarrassed ... I had not disengaged the Word automation conversion when I set things up to test Aspose on the production server. Aspose did the job ... Word automation overwrote that with its own faulty result.

I knew it was going to be my problem. But ... I didn't know it would be so lame. I apologize.
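One way to catch this kind of overwrite earlier is to check which tool actually produced the file on disk. The sketch below is in Python and rests on an assumption: that the converter writes a generator meta tag into its HTML, as Word's filtered HTML export does. The function name `find_generator` is hypothetical, and the regular expressions are a rough approximation, not a full HTML parser.

```python
import re
from typing import Optional

# Matches any <meta ...> tag, case-insensitively.
_META_TAG = re.compile(r"<meta\b[^>]*>", re.IGNORECASE)
# Matches name=value pairs with double-quoted, single-quoted, or bare values.
_ATTR = re.compile(r"""([\w-]+)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))""")


def find_generator(html_text: str) -> Optional[str]:
    """Return the content of the first generator meta tag, or None."""
    for tag in _META_TAG.findall(html_text):
        attrs = {}
        for name, dq, sq, bare in _ATTR.findall(tag):
            # Exactly one of the three value groups is non-empty.
            attrs[name.lower()] = dq or sq or bare
        if attrs.get("name", "").lower() == "generator":
            return attrs.get("content")
    return None
```

Running this against the file immediately after the save, and again after the upload handler finishes, would show whether another converter replaced the output in between.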