EML, MHT and MSG to HTML conversion

Hi,

I am converting the above-mentioned file formats to HTML using Aspose.Words.dll. Please guide me on how to remove all the hyperlinks from the HTML file, and also let me know how to fix the issue below.

The HTML file generated from the MHT file contains the text "Subject: Attachments:ATT00001.vnd.ms-officetheme" above the actual message. Please let me know the reason for this and provide a solution to avoid these extra lines.

Please find the sample file and the output file in the attachment.

Code I am using for the conversion:

// load the email message into an instance of MailMessage
var message = Aspose.Email.Mail.MailMessage.Load("C:\\temp\\message.mht");

// create an instance of MemoryStream to store the intermediate data (MHTML)
var stream = new System.IO.MemoryStream();

// convert the email message to MHTML and store it in the MemoryStream
message.Save(stream, MailMessageSaveType.MHtmlFromat);

// create an instance of LoadOptions and set the LoadFormat to Mhtml
var loadOptions = new Aspose.Words.LoadOptions();
loadOptions.LoadFormat = LoadFormat.Mhtml;

// create an instance of Document and load the MHTML from the MemoryStream
var document = new Aspose.Words.Document(stream, loadOptions);

// create an instance of HtmlSaveOptions and set the SaveFormat to Html
var saveOptions = new Aspose.Words.Saving.HtmlSaveOptions(SaveFormat.Html);

// save the document to an HTML file
document.Save("C:\\temp\\output.html", saveOptions);

Please find the link below for your reference.

http://www.aspose.com/community/forums/354291/eml-to-html-file-conversion/showthread.aspx#354291

Thanks,

Dhivya

Hi Dhivya,


Thanks for your inquiry. First of all, please note that with Aspose.Words you can convert MHT/MHTML files to HTML format directly. Secondly, I was unable to reproduce the second part of your request, i.e. the HTML file generated from the MHT file containing "Subject: Attachments:". I used the following code snippet:

LoadOptions loadOptions = new LoadOptions();
loadOptions.LoadFormat = LoadFormat.Mhtml;
Document document = new Document(@"c:\test\in - Copy.mht", loadOptions);

HtmlSaveOptions saveOptions = new HtmlSaveOptions(SaveFormat.Html);
document.Save(@"c:\test\output.html", saveOptions);

Moreover, I am afraid I couldn't find any hyperlinks to remove in the generated HTML document. Please clarify what you mean by "how to remove all the hyper links from HTML file". Also, I have attached the HTML file generated on my side here for your reference.

Please let me know if I can be of any further assistance.

Best Regards,

Hi Dhivya,

Thanks for your inquiry.
Moreover, you can remove hyperlinks from a Word Document object as follows:
// Requires: using System; using System.Collections; using Aspose.Words;
public static bool removeAllHyperlinks(Aspose.Words.Document wordDoc)
{
    try
    {
        ArrayList nodesToRemove = new ArrayList();

        // Get all FieldStart nodes in the document.
        NodeCollection fieldStarts = wordDoc.GetChildNodes(NodeType.FieldStart, true);

        foreach (FieldStart fieldStart in fieldStarts)
        {
            // Only field starts that belong to a hyperlink are of interest.
            if (fieldStart.FieldType == Aspose.Words.Fields.FieldType.FieldHyperlink)
            {
                // Mark every node from the field start through the field end for removal.
                // Note: this also removes the displayed text of the link.
                Node aNode = fieldStart;
                while (aNode != null && aNode.NodeType != NodeType.FieldEnd)
                {
                    nodesToRemove.Add(aNode);
                    aNode = aNode.NextSibling;
                }

                // aNode is null when the field end is not a sibling of the field start.
                if (aNode != null)
                    nodesToRemove.Add(aNode);
            }
        }

        // Remove all nodes that are part of hyperlink fields.
        foreach (Node node in nodesToRemove)
        {
            node.Remove();
        }

        return true;
    }
    catch (Exception)
    {
        return false;
    }
}

In case of any ambiguity, please let me know.
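
If the goal is to drop the links from the generated HTML file while keeping the link text, another option is to post-process the saved file. The following is only a minimal sketch that uses System.Text.RegularExpressions from the .NET base library, not an Aspose API. The names HtmlLinkStripper and StripAnchors are hypothetical, and a regular expression will not handle every HTML edge case (for example a '>' character inside an attribute value).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Aspose.Words.
public static class HtmlLinkStripper
{
    // Matches an opening <a ...> tag or a closing </a> tag, ignoring case.
    // Tags such as <abbr> or <address> are not matched, because the "a"
    // must be followed by whitespace or by the closing '>'.
    private static readonly Regex AnchorTag =
        new Regex(@"</?a(\s[^>]*)?>", RegexOptions.IgnoreCase);

    // Returns the HTML with anchor tags removed and their inner content kept.
    public static string StripAnchors(string html)
    {
        if (html == null) throw new ArgumentNullException("html");
        return AnchorTag.Replace(html, string.Empty);
    }
}
```

It could be applied to the saved output with something like File.WriteAllText(path, HtmlLinkStripper.StripAnchors(File.ReadAllText(path))), where path is the HTML file written by document.Save.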

Hi,

The output file does not contain the images. Please let me know how to extract the entire contents, including images, from MHT files.

Thanks,

Dhivya

Hi Dhivya,


Thanks for your inquiry. Using the latest version of Aspose.Words, i.e. 10.8.0, I managed to reproduce this issue on my side. We have logged this issue in our bug tracking system. You will be notified as soon as it is resolved.

Note:
The logged issue concerns images appearing as red crosses, not hyperlink removal.

The issues you have found earlier (filed as WORDSNET-5723) have been fixed in this .NET update and this Java update.


This message was posted using Notification2Forum from Downloads module by aspose.notifier.