Loading mhtml file taking a long time

Hi,

I am trying to convert the attached MHTML file to PDF using Aspose.Words for Java with the simple code below:

LoadOptions loadOptions = new LoadOptions();
loadOptions.setLoadFormat(LoadFormat.MHTML);
Document document = new Document("C:\\Sample.mhtml", loadOptions);
document.save("C:\\Sample.pdf", SaveFormat.PDF);

The statement that loads the file (the Document constructor call above) takes nearly 3 minutes to execute, even though this is a very small file (just 57 KB).

Sample.zip (46.1 KB)

It looks like the input file links to some online images, and Aspose.Words tries to download them while loading the file. The machine where my application runs has no internet connection, so Aspose.Words waits until the timeout before proceeding, which explains the long delay.

Is there any way to turn off this image download or reduce the timeout?

Thanks,
Rajiv

@rajivrp,

Thanks for your inquiry. We suggest using the HtmlLoadOptions.WebRequestTimeout property. It sets the number of milliseconds that Aspose.Words waits for a response before a web request times out when loading external resources (images, style sheets) linked in HTML and MHTML documents.

Moreover, the LoadOptions.ResourceLoadingCallback property lets you control how external resources (images, style sheets) are loaded when a document is imported from HTML or MHTML.

Please check the following code example. Hope this helps you.

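HtmlLoadOptions loadOptions = new HtmlLoadOptions();
loadOptions.setLoadFormat(LoadFormat.MHTML);
// Wait at most 10 milliseconds for each web request.
loadOptions.setWebRequestTimeout(10);
// Let the callback decide how each external resource is handled.
loadOptions.setResourceLoadingCallback(new HandleResourceLoading());

// MyDir is assumed to hold the path of the folder containing the sample file.
Document doc = new Document(MyDir + "Sample.mhtml", loadOptions);
doc.save(MyDir + "18.2.pdf");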

public class HandleResourceLoading implements IResourceLoadingCallback
{
    public int resourceLoading(ResourceLoadingArgs args)
    {
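        // ResourceLoadingArgs.OriginalUri holds the resource's URI if URL-based filtering is needed.
        // Skip images so that no download is attempted for them.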
        if (args.getResourceType() == ResourceType.IMAGE)
            return ResourceLoadingAction.SKIP;

        return ResourceLoadingAction.DEFAULT;
    }
}
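
As a possible variation on the callback above, the decision could be based on the resource's URI instead of its type, so that only resources that need network access are skipped. The following is only a sketch under assumptions: RemoteResourceFilter and isRemote are hypothetical names that are not part of Aspose.Words, and the helper only recognizes the http:// and https:// schemes (protocol-relative URIs such as //host/img.png are not handled).

```java
// Hypothetical helper, not part of Aspose.Words: decides whether a resource
// URI points to a remote (http/https) location that would need network access.
final class RemoteResourceFilter {

    private RemoteResourceFilter() {
        // Utility class, no instances.
    }

    /**
     * Returns true when the URI starts with http:// or https:// (case-insensitive).
     * Null, cid: references to parts embedded in the MHTML file, and local
     * paths return false.
     */
    static boolean isRemote(String originalUri) {
        if (originalUri == null) {
            return false;
        }
        String uri = originalUri.trim().toLowerCase(java.util.Locale.ROOT);
        return uri.startsWith("http://") || uri.startsWith("https://");
    }
}
```

Inside resourceLoading, the result of RemoteResourceFilter.isRemote(...) applied to the value of ResourceLoadingArgs.OriginalUri (assumed here to be exposed in Java as args.getOriginalUri()) could then decide between returning ResourceLoadingAction.SKIP and ResourceLoadingAction.DEFAULT.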

Thanks Tahir.

Using setWebRequestTimeout certainly helps, but this API seems to have been deprecated. Would it be safe to use it in production?

Regards,
Rajiv

@rajivrp,

Thanks for your inquiry. Please use the HtmlLoadOptions.WebRequestTimeout property instead of LoadOptions.WebRequestTimeout.

OK, got it.

Regards,
Rajiv