I’m trying to convert a Word document to text, and I’m getting strange symbols that I would not get if I were converting with Word automation.
Here is the relevant part of my code:
TxtSaveOptions txtSaveOptions = new TxtSaveOptions();
txtSaveOptions.Encoding = Encoding.Default;
doc.Save(firstStream, txtSaveOptions);
I’ve attached example files showing the symbols I get with Aspose versus Word. You can see the symbols in a hex editor or any compare tool; I use Beyond Compare.
Here are some more strange symbols Aspose would give me (in red):
I have to somehow avoid getting these symbols, and since I don’t know how many or which of them I can get, I don’t want to solve this issue by replacing them.
Hello
Thanks for your inquiry. Could you please attach your input document here for testing? I will check the problem on my side and provide you more information.
and I got output identical to what Word automation gives me.
I guess the other problem I have with the BOM and other special characters would be a different issue.
Do you have any idea how to avoid these symbols? I still get them after the fix.
Hi
Thanks for your request. In your case, you could simply create your own converter to TXT, as described here: https://reference.aspose.com/words/net/aspose.words/documentvisitor/
This approach will allow you to control how the document is converted to TXT.
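As a rough illustration of that approach, here is a minimal sketch of a visitor that writes out only run text and paragraph breaks and skips field codes. The class name PlainTextVisitor and the file names are hypothetical, and the choice of which nodes to emit is an assumption; you would need to adjust it to match the output you get from Word automation.

```csharp
using System.IO;
using System.Text;
using Aspose.Words;
using Aspose.Words.Fields;

// Hypothetical visitor: collects only the text you decide to keep.
public class PlainTextVisitor : DocumentVisitor
{
    private readonly StringBuilder mBuilder = new StringBuilder();
    private bool mIsSkipText; // true while inside a field code

    public string GetText()
    {
        return mBuilder.ToString();
    }

    // Called for every Run node; append its text unless we are inside a field code.
    public override VisitorAction VisitRun(Run run)
    {
        if (!mIsSkipText)
            mBuilder.Append(run.Text);
        return VisitorAction.Continue;
    }

    // Emit a line break at the end of each paragraph (assumption: CR+LF is wanted).
    public override VisitorAction VisitParagraphEnd(Paragraph paragraph)
    {
        mBuilder.Append("\r\n");
        return VisitorAction.Continue;
    }

    // Field code starts here: skip text until the separator or the end of the field.
    public override VisitorAction VisitFieldStart(FieldStart fieldStart)
    {
        mIsSkipText = true;
        return VisitorAction.Continue;
    }

    // Field result starts here: resume output.
    public override VisitorAction VisitFieldSeparator(FieldSeparator fieldSeparator)
    {
        mIsSkipText = false;
        return VisitorAction.Continue;
    }

    // Some fields have no separator, so resume output at the field end as well.
    public override VisitorAction VisitFieldEnd(FieldEnd fieldEnd)
    {
        mIsSkipText = false;
        return VisitorAction.Continue;
    }
}

public static class Program
{
    public static void Main()
    {
        // "input.doc" and "output.txt" are placeholder file names.
        Document doc = new Document("input.doc");
        PlainTextVisitor visitor = new PlainTextVisitor();
        doc.Accept(visitor);
        File.WriteAllText("output.txt", visitor.GetText(), Encoding.Default);
    }
}
```

Because the visitor only appends what you explicitly handle, nodes that have no override contribute nothing to the output, so you do not need to know in advance which symbols to replace.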
Best regards,