@australian.dev.nerds There is no need to specify additional load or save options when converting from MHTML to PDF. You can use the following code to convert MHTML to PDF:
Document doc = new Document(@"in.mhtml");
doc.Save(@"out.pdf");
Aspose.Words does not allocate any unmanaged resources when loading a document, so there is no need to dispose of the Document object. It is collected by the garbage collector once the Document object goes out of scope.
LoadOptions is the base class for HtmlLoadOptions, so all options available in the LoadOptions class are also available in the HtmlLoadOptions class. HtmlLoadOptions also provides properties that are specific to HTML-like formats.
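To illustrate the inheritance described above, here is a minimal sketch that sets one inherited option and one HTML-specific option on the same HtmlLoadOptions object. The file names, the UTF-8 encoding, and the timeout value are assumptions chosen for illustration only; as noted earlier, none of these options is required for a plain MHTML to PDF conversion.

```csharp
using System.Text;
using Aspose.Words;
using Aspose.Words.Loading;

class HtmlLoadOptionsSketch
{
    static void Main()
    {
        HtmlLoadOptions options = new HtmlLoadOptions();

        // Inherited from the LoadOptions base class.
        // UTF-8 is an assumed value, not something the input requires.
        options.Encoding = Encoding.UTF8;

        // Specific to HtmlLoadOptions: timeout in milliseconds for web requests
        // made while loading external resources (assumed value).
        options.WebRequestTimeout = 10000;

        // "in.mhtml" and "out.pdf" are placeholder file names.
        Document doc = new Document("in.mhtml", options);
        doc.Save("out.pdf");
    }
}
```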
Could you please attach your input MHTML document here for testing? We will check the issue and provide you with more information. The problem might occur because the image is not available or Aspose.Words does not have access to it. You can implement the IResourceLoadingCallback interface if you want to control how Aspose.Words loads external resources when importing a document.
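As a starting point for diagnosing which external resources are requested, a callback along the following lines could be used. This is only a sketch: the class name LoggingResourceCallback and the file names are hypothetical, and the callback merely logs each request and then lets Aspose.Words load the resource in its default way.

```csharp
using System;
using Aspose.Words;
using Aspose.Words.Loading;

// Hypothetical helper: logs every external resource requested during import.
class LoggingResourceCallback : IResourceLoadingCallback
{
    public ResourceLoadingAction ResourceLoading(ResourceLoadingArgs args)
    {
        Console.WriteLine(args.ResourceType + ": " + args.OriginalUri);

        // Let Aspose.Words load the resource as usual.
        // ResourceLoadingAction.Skip would skip it instead.
        return ResourceLoadingAction.Default;
    }
}

class ResourceCallbackSketch
{
    static void Main()
    {
        LoadOptions options = new LoadOptions();
        options.ResourceLoadingCallback = new LoggingResourceCallback();

        // "in.mhtml" and "out.pdf" are placeholder file names.
        Document doc = new Document("in.mhtml", options);
        doc.Save("out.pdf");
    }
}
```

If the image URL appears in the log but the image is still missing from the output, the request itself is probably failing. In that case the callback could download the data with your own code, pass it to args.SetData, and return ResourceLoadingAction.UserProvided.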
Hello, I wonder whether anyone has ever asked for this as a feature request? Words is not an end-user app but a high-end, high-priced SDK, so flexibility is expected.
One thing: if we use Words.FileFormatUtil.DetectFileFormat just to check the file format for other purposes (not for opening by Aspose.Words), what should I do if the default enum value Auto is returned?! How should the file type be interpreted then?
@australian.dev.nerds I have logged a feature request in our defect tracking system as WORDSNET-25551. We will consider adding such a feature.
The Words.FileFormatUtil.DetectFileFormat method never returns LoadFormat.Auto. This enum value is used by the Document constructor to let Aspose.Words know that it should auto-detect the load format (the default behavior when no load options are passed).
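For checking a file format without opening the document, a minimal sketch could look like the following. The file name is a placeholder, and the sketch assumes that an unrecognized file is reported as LoadFormat.Unknown rather than LoadFormat.Auto.

```csharp
using System;
using Aspose.Words;

class DetectFormatSketch
{
    static void Main()
    {
        // "in.mhtml" is a placeholder file name.
        FileFormatInfo info = FileFormatUtil.DetectFileFormat("in.mhtml");

        if (info.LoadFormat == LoadFormat.Unknown)
        {
            // Assumption: this is the value reported for unrecognized files.
            Console.WriteLine("The file format was not recognized.");
        }
        else
        {
            Console.WriteLine("Detected format: " + info.LoadFormat);
            Console.WriteLine("Encrypted: " + info.IsEncrypted);
        }
    }
}
```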
@australian.dev.nerds Thank you for the additional information. The image in your MHTML document is not accessible. It is not displayed when the document is viewed in a browser or opened in MS Word. The problematic image URL is the following: https://docs.microsoft.com/answers/themes/minerva/images/qna-email-logo.png
If the document is converted to PDF using MS Word, the image is also not loaded: aw.pdf (63.0 KB) ms.pdf (21.5 KB)
Thanks, but I am not so sure. Here it is opened in a browser: Untitled.jpg (20.2 KB)
Anyway, kindly run this project sample to compare Words vs Cells conversion.
Yep, the Cells output is poor, but at least it downloads the image. Words never gets it; I tested against many emails: WindowsApplication55.zip (19.6 KB)
TIFF / EPUB / XPS: only the first page is saved. Is it possible to have all pages in a single file?
And the image download problem I had earlier still exists. I disabled Windows Firewall entirely and still can't find the problem. Are you running my exact project code above? If yes, any idea what might be wrong?