Bad txt conversion

Hi, converting a txt file containing characters like “à é è” produces bad images and PDFs.
See attached examples and screenshot of resulting images.
Archive.zip (11.9 KB)

this is our code

m_oDocument = new Aspose.Words.Document(_oInputStream);
((Aspose.Words.Document)m_oDocument).Save(_oOutputStream, OptionsW);

@BooleServer,

Thanks for your inquiry. We tested the scenario using the latest version, Aspose.Words for .NET 17.10, and could not reproduce the issue. Please use Aspose.Words for .NET 17.10. We have attached the output PDF to this post for your reference. 17.10.pdf (6.7 KB)

We use Aspose.Words 17.10 too, of course.
We think it’s a problem with the ANSI encoding of the txt file: if you create a new txt document and write exactly these characters: ààèè
(that is, a few accented letters with no spaces between them), Aspose creates a badly converted PNG, BUT if you insert spaces in the original txt, the conversion is good.
What’s more, if you save the txt as Unicode instead of ANSI, the conversion is again good.
The problem is with ANSI (and possibly other encodings; we can’t say for a txt created on macOS) and accented letters with no spaces between them.

@BooleServer,

Thanks for your inquiry. Aspose.Words auto detects text encoding of text file and supports ANSI encoding. Could you please share the console application (source code without compilation error) that helps us to reproduce your problem on our end? Please also share the problematic output image. We will investigate the issue on our side and provide you more information.

I’m attaching again the original txt and the PNG and PDF generated by Aspose…
New Text Document.txt 0 export.zip (7.2 KB)

the code is very simple (we don’t use console application but web application):

m_oDocument = new Aspose.Words.Document(_oInputStream); // txt file passed as stream
OptionsW = new Aspose.Words.Saving.ImageSaveOptions(Aspose.Words.SaveFormat.Png);
((Aspose.Words.Document)m_oDocument).Save(_oOutputStream, OptionsW);

You can see that if you insert some spaces between the letters in the original txt file, the conversion is good.

@BooleServer,

Thanks for sharing the details. We could not reproduce the issue on Windows. We are setting up macOS on our end. As soon as everything is set up, we will test your issue and post the results here for your reference.

@BooleServer,

Thanks for your patience. We tested the scenario on macOS and could not reproduce the issue. Please check the attached output image.
17.10.zip (553 Bytes)

Sorry, but when we open your PNG (on both Mac and Windows) it looks as bad as our files.
See the attached screenshot (it shows your txt, as you can see).
So why do you say it looks right to you? How is that possible?
Screen Shot 2017-10-24 at 09.59.44.png (80.7 KB)

@BooleServer,

Thanks for your inquiry. We have noticed that the Chinese text is imported incorrectly into Aspose.Words’ DOM. We have logged this problem in our issue tracking system as WORDSNET-16047. You will be notified via this forum thread once this issue is resolved.

We apologize for your inconvenience.

chinese text?
Sorry, but let’s clarify our problem: we create a txt file on Windows or macOS and write this into it:
èèààù

That’s it, no Chinese.
The “Chinese” chars appear when Aspose.Words converts the “èèààù” txt file.

Is that what you mean? Is that what WORDSNET-16047 is about?

thanks a lot

@BooleServer,

Thanks for your inquiry. Please note that Aspose.Words mimics the same behavior as MS Word does. MS Word detects “ààèù” as Japanese text with Shift-JIS encoding. Aspose.Words auto detects this encoding. Please open your input txt file in MS Word to check its behavior. See the attached input image. input.png (15.0 KB)

The output generated by Aspose.Words differs from the input. Please check the screenshot of the output PNG. output.png (17.9 KB)

If you want the output to be “ààèù” in the PNG file, please set LoadOptions.Encoding to Encoding.Default.

LoadOptions options = new LoadOptions();
options.Encoding = Encoding.Default; 

var document = new Document(MyDir + "in.txt", options);
document.Save(MyDir + "17.10.png");
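
For readers wondering why “ààèù” can be read as Japanese at all, here is a minimal sketch of the byte-level ambiguity. It does not use Aspose.Words and is not its detection logic; the class and method names are made up for illustration, and it assumes the txt was saved as Windows-1252 (a typical “ANSI” code page on Western Windows). In that code page the accented letters are single bytes in the 0xE0–0xF9 range, and each consecutive pair of such bytes is also a valid Shift-JIS double-byte character. A space (0x20) is not a valid Shift-JIS trail byte, which may be why spacing the letters changes the result.

```csharp
using System;
using System.Text;

static class EncodingAmbiguityDemo
{
    // Hypothetical helper: encodes the text as Windows-1252 and then
    // decodes the very same bytes as Shift-JIS (code page 932).
    public static string ReadAnsiBytesAsShiftJis(string text)
    {
        // Required on .NET Core / .NET 5+ (System.Text.Encoding.CodePages);
        // on the classic .NET Framework this line can be dropped.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        byte[] ansiBytes = Encoding.GetEncoding(1252).GetBytes(text);
        return Encoding.GetEncoding(932).GetString(ansiBytes);
    }

    static void Main()
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Six accented letters become six single bytes in Windows-1252.
        byte[] ansiBytes = Encoding.GetEncoding(1252).GetBytes("àààèèò");
        Console.WriteLine(BitConverter.ToString(ansiBytes)); // E0-E0-E0-E8-E8-F2

        // Read as Shift-JIS, the same six bytes pair up into three characters.
        string misread = ReadAnsiBytesAsShiftJis("àààèèò");
        Console.WriteLine(misread.Length); // 3
    }
}
```

Nothing in the file itself says which reading is intended, so any auto-detection has to guess; saving the txt as Unicode removes the ambiguity.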

terrific.

Will the fix for WORDSNET-16047 remove the need to set LoadOptions.Encoding to Encoding.Default?

Will setting LoadOptions.Encoding to Encoding.Default change the behavior in other cases, such as Word files? Our software is intended to be generic and convert all txt/docx files…

thank you

@BooleServer,

Thanks for your inquiry.

Yes, your understanding is correct. The issue (WORDSNET-16047) is related to import of Japanese text into Aspose.Words’ DOM.

There is no need to use LoadOptions.Encoding. Aspose.Words detects encoding of text file automatically.

so we should wait for WORDSNET-16047 resolution to fix our “àààèèò” problem, is that right?

@BooleServer,

Thanks for your inquiry. Yes. We would like to share with you that the output of “àààèèò” will be 珥琲頸 after the fix of WORDSNET-16047.

Please share some news about WORDSNET-16047.
We haven’t received any notice since October 2017!

@BooleServer,

Please accept my apologies for your inconvenience.

Please use the following code example with latest version of Aspose.Words for .NET 18.1 to get the correct output. Aspose.Words auto detects the encoding of text file.

var document = new Document(MyDir + "in.txt");
document.Save(MyDir + "18.1.png");

We have attached the output PNG with this post for your kind reference. 18.1.png (1.7 KB)

We tried with 18.1 but nothing changes.

Moreover, the output PNG you linked is badly encoded: don’t you see?
We tried downloading it on Windows and Mac.

Please share some news.

@BooleServer,

Thanks for your inquiry. The issue WORDSNET-16047 was logged to detect the encoding of your input document as Shift-JIS. After analyzing this issue, our product team closed this issue as ‘Not a Bug’.

Please note that Aspose.Words’ encoding auto-detection returns Japanese, which is exactly the same result MS Word produces. So, in your case, you need to set LoadOptions.Encoding explicitly to get the desired output.

Please let us know if you have any more queries.
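
For a generic converter that handles both txt and docx input, one possible way to apply the advice above is to force an encoding only when the input is detected as plain text. This is only a sketch: the class and method names are hypothetical, it assumes a seekable input stream, and it relies on FileFormatUtil.DetectFileFormat and LoadFormat.Text from Aspose.Words. The thread does not say how an explicit encoding interacts with Unicode files that carry a byte order mark, so that case should be tested separately.

```csharp
using System.IO;
using System.Text;
using Aspose.Words;
using Aspose.Words.Saving;

static class TextAwareConverter
{
    // Hypothetical helper: converts any supported input to PNG and applies
    // textEncoding only when the input is detected as plain text.
    public static void ConvertToPng(Stream input, Stream output, Encoding textEncoding)
    {
        // Remember where the caller left the stream (assumes it is seekable).
        long start = input.Position;
        FileFormatInfo info = FileFormatUtil.DetectFileFormat(input);
        input.Position = start; // rewind defensively before loading

        LoadOptions loadOptions = new LoadOptions();
        if (info.LoadFormat == LoadFormat.Text)
        {
            // Skip auto-detection for txt files only; docx and other
            // formats keep the default loading behavior.
            loadOptions.Encoding = textEncoding;
        }

        Document document = new Document(input, loadOptions);
        document.Save(output, new ImageSaveOptions(SaveFormat.Png));
    }
}
```

A call might look like `TextAwareConverter.ConvertToPng(_oInputStream, _oOutputStream, Encoding.Default);`. Note that Encoding.Default is the system ANSI code page on the .NET Framework but UTF-8 on .NET Core and later, so pinning a specific code page such as `Encoding.GetEncoding(1252)` may be more predictable. The trade-off is that a txt file genuinely written in Shift-JIS would then be misread, so the forced encoding has to match what your users actually upload.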