Long delay creating a "Preview" PDF from a .msg file

Hello Aspose,

We wrote an Outlook Add-in that allows people to save emails into our system.

When the user chooses to save an email into our system we present a dialog to the user in which they will see a “Preview” of the email. This normally works just fine, but recently we’ve been seeing emails for which this process takes a very long time.

As luck would have it, this has been happening for emails internal to our company, like “messages from our CEO” and the like. So I was able to obtain a sample of such an email without any sensitive data, and I will attach it to this topic.

In order to obtain the “Preview” PDF we need to use two of your products: Aspose.Email and Aspose.Words. We use Aspose.Email to create an HTML stream that represents the email, and then Aspose.Words to convert that to PDF.

These are the steps we execute (in VB code) up to the point where the long delay occurs:

' Working variables
Dim oEml As Aspose.Email.MailMessage = Nothing
Dim oSaveOptions As Aspose.Email.HtmlSaveOptions = Nothing
Dim oStream As IO.MemoryStream = Nothing
Dim oWords As Aspose.Words.Document = Nothing
Dim oWordsOptions As Aspose.Words.Loading.LoadOptions = Nothing

' Load the .msg file and render it to an HTML stream with Outlook-style headers
oEml = Aspose.Email.MailMessage.Load(sFile)
oEml.TimeZoneOffset = TimeZone.CurrentTimeZone.GetUtcOffset(oEml.Date.ToLocalTime)
oSaveOptions = New Aspose.Email.HtmlSaveOptions
oSaveOptions.HtmlFormatOptions = Aspose.Email.HtmlFormatOptions.WriteCompleteEmailAddress Or Aspose.Email.HtmlFormatOptions.WriteHeader Or Aspose.Email.HtmlFormatOptions.DisplayAsOutlook
oSaveOptions.MailMessageSaveType = Aspose.Email.MailMessageSaveType.HtmlFormat
oStream = New IO.MemoryStream
oEml.Save(stream:=oStream, options:=oSaveOptions)

' Rewind the stream so it is read from the beginning (defensive; harmless if the loader already seeks)
oStream.Position = 0

' Load the HTML into Aspose.Words
oWordsOptions = New Aspose.Words.Loading.LoadOptions
oWordsOptions.LoadFormat = Aspose.Words.LoadFormat.Html
oWords = New Aspose.Words.Document(oStream, oWordsOptions)

It’s that last statement that causes a LONG delay.
Email template test.zip (315.9 KB)

See the attached file for a sample .msg file that causes this problem.

By the way, after loading the HTML into Aspose.Words we simply save it as a PDF:

oWords.Save(fileName:=sToFileName, saveFormat:=Aspose.Words.SaveFormat.Pdf)

It would be great if you could find out why these emails in particular are taking so long to process with the above code and fix it. Alternatively, if you can suggest a better way to produce an identical looking PDF from a .msg file, we’d be happy to change our code.

@kidkeogh The delay is caused by the long time required to load the images. You can either skip loading images using IResourceLoadingCallback:

// Requires: using System; using Aspose.Words.Loading;

// Convert the message to MHTML, then load it with a callback that skips resource loading.
Aspose.Email.MailMessage msg = Aspose.Email.MailMessage.Load(@"C:\Temp\in.msg");
msg.Save(@"C:\Temp\tmp.mhtml", Aspose.Email.SaveOptions.DefaultMhtml);
LoadOptions opt = new LoadOptions();
opt.ResourceLoadingCallback = new SkipImagesCallback();
Aspose.Words.Document doc = new Aspose.Words.Document(@"C:\Temp\tmp.mhtml", opt);
doc.Save(@"C:\Temp\out.pdf");

private class SkipImagesCallback : IResourceLoadingCallback
{
    public ResourceLoadingAction ResourceLoading(ResourceLoadingArgs args)
    {
        // Log the URI of each resource passed to the callback, then skip loading it.
        Console.WriteLine(args.OriginalUri);
        return ResourceLoadingAction.Skip;
    }
}

Or decrease the web request timeout:

// Requires: using Aspose.Words.Loading;
Aspose.Email.MailMessage msg = Aspose.Email.MailMessage.Load(@"C:\Temp\in.msg");
msg.Save(@"C:\Temp\tmp.mhtml", Aspose.Email.SaveOptions.DefaultMhtml);
HtmlLoadOptions opt = new HtmlLoadOptions();
// The timeout is given in milliseconds.
opt.WebRequestTimeout = 100;
Aspose.Words.Document doc = new Aspose.Words.Document(@"C:\Temp\tmp.mhtml", opt);
doc.Save(@"C:\Temp\out.pdf");

Thank you Alexey. I’m on leave until Wednesday but I’ll give that a go then.

@alexey.noskov

Okay, I must admit I am very confused.

I implemented your first suggestion using the SkipImagesCallback class.

The preview does indeed generate much faster, but I need to understand what it does differently, as the output looks identical. I had half expected to see it produce a PDF with the images removed, but they’re still there. When or how will it produce a different looking output? I need to know this as we also convert emails to PDFs for other purposes where we cannot have it change the look and feel of the output.

Conversely, if it would always create the same output, I can’t help but wonder why anyone would ever NOT set the LoadOptions as per your first example.

@kidkeogh The document might contain both embedded and external (linked) images. The code with IResourceLoadingCallback skips loading only external resources, so embedded images are still there.


Thanks for the clarification. Okay, in that case I will need to handle two different scenarios: a “quick” scenario for Preview purposes and a “slow” one when we need to faithfully render the document for inclusion in other documents.
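
A minimal sketch of how the two scenarios could share one code path, assuming the MHTML route and the skip-everything callback from the replies above. The names EmailPdfConverter, SkipExternalResourcesCallback, ConvertToPdf and the quickPreview flag are hypothetical and only serve the illustration. It has not been run against the sample .msg file.

```csharp
using System.IO;
using Aspose.Words;
using Aspose.Words.Loading;

// Hypothetical helper: one conversion routine with a "quick" and a "slow" mode.
public static class EmailPdfConverter
{
    // Skips every resource Aspose.Words passes to the callback.
    // Per the reply above, embedded images are not affected by this.
    private sealed class SkipExternalResourcesCallback : IResourceLoadingCallback
    {
        public ResourceLoadingAction ResourceLoading(ResourceLoadingArgs args)
        {
            return ResourceLoadingAction.Skip;
        }
    }

    // quickPreview = true:  skip external resources (fast, may omit linked images).
    // quickPreview = false: default loading (slower, linked images are fetched).
    public static void ConvertToPdf(string msgPath, string pdfPath, bool quickPreview)
    {
        Aspose.Email.MailMessage msg = Aspose.Email.MailMessage.Load(msgPath);

        using (MemoryStream mhtml = new MemoryStream())
        {
            // Render the email to MHTML in memory instead of a temporary file.
            msg.Save(mhtml, Aspose.Email.SaveOptions.DefaultMhtml);
            mhtml.Position = 0; // rewind before loading

            LoadOptions options = new LoadOptions();
            if (quickPreview)
            {
                options.ResourceLoadingCallback = new SkipExternalResourcesCallback();
            }

            Document doc = new Document(mhtml, options);
            doc.Save(pdfPath, SaveFormat.Pdf);
        }
    }
}
```

The Preview dialog would call ConvertToPdf with quickPreview set to true, while the faithful-rendering path would pass false. Keeping the default loading behaviour for the slow path means its output should stay as it is today.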

In any case, your suggestion was very helpful and you can consider this question “resolved”. Thanks again.
