Downloading Inline Images from MSG Problem

Hi,

We have an e-mail that is causing Aspose to not return the processing thread due to an issue downloading some of the inline images in the e-mail.

Code is as follows:

using var message = MailMessage.Load(sourceFile.FullName);
message.TimeZoneOffset = timeZoneInfo.GetUtcOffset(message.Date); // show correct date
var mhtSaveOptions = new MhtSaveOptions { MhtFormatOptions = MhtFormatOptions.WriteHeader | MhtFormatOptions.WriteCompleteEmailAddress };

using var msgStream = new MemoryStream();
message.Save(msgStream, mhtSaveOptions);
msgStream.Position = 0;

var document = new Aspose.Words.Document(msgStream);
document.Save(finalFile.FullName, Aspose.Words.SaveFormat.Pdf);

Is there a secure method for uploading the MSG file for you to review?

Ben.

@bhogan It is safe to attach documents in the forum; only you and Aspose staff can access your attachments.
In your case, you can use IResourceLoadingCallback either to control the external images loading process or to skip loading external resources.

SOURCE_737253de048e4dd0b70081834c8ad3d3.zip (106.5 KB)

Hi Alexey.

I’ve attached the file we are having trouble with. What is strange is that the call returns after around 25 minutes. We checked the image files that were having trouble downloading and we could download them instantly from the browser.

Ben.

@bhogan Thank you for the additional information. It looks like there is a problem with downloading images from https://media.radissonhotels.net. I implemented IResourceLoadingCallback and set the timeout to 1 second:

using System;
using System.Net;
using Aspose.Words.Loading;

LoadOptions opt = new LoadOptions();
opt.ResourceLoadingCallback = new ResourceLoadingCallback();

// out.mhtml is the MHTML document produced by Aspose.Email
var document = new Aspose.Words.Document(@"C:\Temp\out.mhtml", opt);
document.Save(@"C:\Temp\out.pdf", Aspose.Words.SaveFormat.Pdf);

private class ResourceLoadingCallback : IResourceLoadingCallback
{
    public ResourceLoadingAction ResourceLoading(ResourceLoadingArgs args)
    {
        Console.WriteLine(args.OriginalUri);

        using (WebDownload webClient = new WebDownload())
        {
            webClient.Timeout = 1000;
            try
            {
                args.SetData(webClient.DownloadData(args.OriginalUri));
                return ResourceLoadingAction.UserProvided;
            }
            catch (WebException)
            {
                // The download failed or timed out, so leave this resource out of the document.
                Console.WriteLine("Skipped");
                return ResourceLoadingAction.Skip;
            }
        }
    }
}

public class WebDownload : WebClient
{
    /// <summary>
    /// Time in milliseconds
    /// </summary>
    public int Timeout { get; set; }

    public WebDownload() : this(60000) { }

    public WebDownload(int timeout)
    {
        this.Timeout = timeout;
    }

    protected override WebRequest GetWebRequest(Uri address)
    {
        var request = base.GetWebRequest(address);
        if (request != null)
        {
            request.Timeout = this.Timeout;
        }
        return request;
    }
}

So there are some problems with downloading the images from your external resources programmatically.

Hi Alexey,

Thank you so much for your assistance, it’s an odd one. When requesting the images manually (through the browser) they come back instantly. Appreciate the sample code you’ve provided to deal with it.

Very much appreciated. Thank you.

Ben.

@bhogan Yes, I also noticed the images are accessible in the browser, but are not accessible programmatically using WebClient.

Thanks Alexey. We did a bit more testing today and discovered two things.

  1. We changed the request headers because it seemed the server was rejecting or was really slow responding to connections from unknown clients.

  2. Some of the PNG files we were downloading were actually the WebP file format.

https://stackoverflow.com/questions/48790438/why-can-the-c-sharp-httpclient-not-call-this-url-always-times-out
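
The header change in point 1 could be sketched roughly as follows. This is only an illustration: the thread does not say which headers were changed, so the User-Agent, Accept and Accept-Language values here are assumptions, and `BrowserLikeWebClient` and `ApplyBrowserHeaders` are hypothetical names.

```csharp
using System;
using System.Net;

// Hypothetical helper: a WebClient that sends browser-like headers and applies a timeout.
public class BrowserLikeWebClient : WebClient
{
    // Assumed value; any realistic browser User-Agent string could be used instead.
    public const string BrowserUserAgent =
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36";

    // Timeout in milliseconds.
    public int TimeoutMilliseconds { get; set; } = 10000;

    // Sets the headers a browser would normally send. The exact set a given
    // server expects is not known; these are illustrative.
    public static void ApplyBrowserHeaders(HttpWebRequest request)
    {
        request.UserAgent = BrowserUserAgent;
        request.Accept = "image/png,image/jpeg,*/*;q=0.5";
        request.Headers[HttpRequestHeader.AcceptLanguage] = "en-US,en;q=0.9";
    }

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        if (request != null)
        {
            request.Timeout = TimeoutMilliseconds;
        }
        if (request is HttpWebRequest http)
        {
            ApplyBrowserHeaders(http);
        }
        return request;
    }
}
```

A client like this could be used in place of a plain WebClient inside an IResourceLoadingCallback implementation, passing the downloaded bytes to args.SetData.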

WebP data passed as a byte array is not accepted by the ResourceLoadingArgs.SetData call, so we needed to convert it to PNG; otherwise Aspose would insert placeholders instead of the images.
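
Because the file extension cannot be trusted here, one way to decide whether a conversion is needed is to inspect the first bytes of the download. The sketch below relies only on the published file signatures (a WebP file is a RIFF container with "WEBP" at offset 8; a PNG starts with the bytes 89 50 4E 47 0D 0A 1A 0A). `ImageSniffer` is a hypothetical helper name, not part of Aspose.Words.

```csharp
using System;

// Hypothetical helper that identifies an image format from its leading bytes.
public static class ImageSniffer
{
    private static readonly byte[] PngSignature =
        { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };

    // WebP: bytes 0-3 are "RIFF", bytes 4-7 hold the chunk size, bytes 8-11 are "WEBP".
    public static bool IsWebP(byte[] data)
    {
        return data != null
            && data.Length >= 12
            && data[0] == (byte)'R' && data[1] == (byte)'I'
            && data[2] == (byte)'F' && data[3] == (byte)'F'
            && data[8] == (byte)'W' && data[9] == (byte)'E'
            && data[10] == (byte)'B' && data[11] == (byte)'P';
    }

    // PNG: the file starts with the fixed 8-byte signature.
    public static bool IsPng(byte[] data)
    {
        if (data == null || data.Length < PngSignature.Length)
        {
            return false;
        }
        for (int i = 0; i < PngSignature.Length; i++)
        {
            if (data[i] != PngSignature[i])
            {
                return false;
            }
        }
        return true;
    }
}
```

In a resource loading callback, a check such as `ImageSniffer.IsWebP(bytes)` could gate the PNG conversion so that images that really are PNG are passed to SetData unchanged.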

If you need any of our sample code just let me know. Is this something Aspose will fix in the future?

@bhogan Currently Aspose.Words does not support WEBP images. This issue is logged as WORDSNET-21107.
Could you please let us know the target framework of your application? In theory, we can fix this quickly for the .NET Standard version, because it uses SkiaSharp to deal with graphics and SkiaSharp supports the WEBP format.
For now, as a workaround, you can use SkiaSharp to convert the image to PNG. Something like this:

private class ResourceLoadingCallback : IResourceLoadingCallback
{
    public ResourceLoadingAction ResourceLoading(ResourceLoadingArgs args)
    {
        if (args.OriginalUri == "http://mysite.com/webp.png")
        {
            // Here you get WEBP bytes.
            byte[] webpBytes = File.ReadAllBytes(@"C:\Temp\webp.png");

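            using (SkiaSharp.SKBitmap bmp = SkiaSharp.SKBitmap.Decode(webpBytes))
            using (MemoryStream pngStream = new MemoryStream())
            {
                // SKBitmap.Decode returns null when the bytes cannot be decoded.
                if (bmp == null)
                    return ResourceLoadingAction.Skip;

                // Re-encode the decoded bitmap as PNG and hand it to Aspose.Words.
                bmp.Encode(pngStream, SkiaSharp.SKEncodedImageFormat.Png, 100);

                args.SetData(pngStream.ToArray());
                return ResourceLoadingAction.UserProvided;
            }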
        }

        return ResourceLoadingAction.Default;
    }
}

Also, I have logged the hang that occurs when loading the MHTML file as WORDSNET-23888. We will additionally investigate whether we can fix this issue in the Aspose.Words code, as you suggested.

The issues you have found earlier (filed as WORDSNET-23888) have been fixed in this Aspose.Words for .NET 22.7 update also available on NuGet.

The issues you have found earlier (filed as WORDSNET-21107) have been fixed in this Aspose.Words for Java 23.12 update.