PDF conversion of a file with a .TXT extension that is actually an EML file

Hello,

we have a few questions:

Baseline:
We have an email attachment (see attachment) whose extension is .TXT. Because of the extension, we use Aspose.Words for the PDF conversion.

The problem is that the attachment is actually an EML file.

1.) When we convert this file with Aspose.Words, the PDF does not contain the pure text as we expected. Instead it looks like a rendered mail, only without the header info.
-> Why isn’t there pure text in the PDF?
-> How does Aspose decide whether the output is pure text or re-formatted content?
-> If Aspose already detects that it is an EML file, why are the headers removed?
-> How can we be sure that converting a “pure” TXT file produces pure text in the PDF, without any format adjustments?

2.) Is there a way for Aspose to analyze a TXT file and determine that it is actually an EML file, so that we can then use Aspose.Email to convert it?

EMLInside.zip (84.9 KB)

Thank you and greetings,
Andy

@AStelzner The extension does not matter to Aspose.Words, because it analyzes the file and detects its real file format. In your case the original file format is detected as MHTML. You can use FileFormatUtil to detect the original document format. Please see our documentation to learn more about detecting the file format.

In your case if you run the following simple code, you will get MHTML:

FileFormatInfo info = FileFormatUtil.DetectFileFormat(@"C:\Temp\EMLInside.txt");
Console.WriteLine(info.LoadFormat);

Thanks!

The following questions are still unanswered :)
-> If Aspose already detects that it is an EML file, why are the headers removed in the PDF?
-> How can we be sure that converting a “pure” TXT file produces pure text in the PDF, without any format adjustments? Should we use Aspose.Cells to convert to PDF instead? What is the “best practice” for plain text?

@AStelzner

Aspose.Words simply loads the document as MHTML. If you change the extension of your file to .mhtml and open it in a browser, you will see exactly the same content.

You can explicitly specify the load format, but in that case the output document will show the internal representation of your MHTML file, just as if you had opened it in Notepad. For example, the following code will produce this output: out.pdf (179.7 KB)

LoadOptions opt = new LoadOptions();
opt.LoadFormat = LoadFormat.Text;
Document doc = new Document(@"C:\Temp\EMLInside.txt", opt);
doc.Save(@"C:\Temp\out.pdf");

Which tool to use depends on the original format of your file.
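
A minimal sketch of branching on the detected format before converting, assuming the Aspose.Words for .NET API used in the snippets above. The input and output paths are the ones from this thread. The mail branch is only a placeholder, since the Aspose.Email conversion itself is not shown here:

```csharp
using System;
using Aspose.Words;

class ConvertByDetectedFormat
{
    static void Main()
    {
        string input = @"C:\Temp\EMLInside.txt";

        // Ask Aspose.Words what the file really is, regardless of its extension.
        FileFormatInfo info = FileFormatUtil.DetectFileFormat(input);

        if (info.LoadFormat == LoadFormat.Mhtml)
        {
            // Placeholder (assumption): hand the file over to your own
            // mail conversion path, for example one built on Aspose.Email.
            Console.WriteLine("Detected MHTML/mail content: " + input);
        }
        else
        {
            // Any other detected format: let Aspose.Words load and convert it.
            Document doc = new Document(input);
            doc.Save(@"C:\Temp\out.pdf");
        }
    }
}
```

If the file should always be rendered as raw text, the LoadOptions approach shown above (LoadFormat.Text) can be used in place of the detection step.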