Performance issue converting HTML to PDF due to loading external resources

Hi, I found that creating an Aspose.Pdf.Document from an HTML file has a performance issue when the component tries to load external resources linked from the HTML.

I have attached a sample HTML file, which is the Google search results page for the word “Google”: Google Search.zip (158.8 KB)

Simply creating the PDF Document object from this HTML while connected to the internet can take up to 45 seconds:

new Document("Google Search.html", new HtmlLoadOptions());

When I disconnect from the network, it takes only 4 seconds.

After inspecting the network traffic, I realised that the PDF component was trying to load some external resources, for example:

* http://ssl.gstatic.com/gb/images/i2_2ec824b0.png
* http://ssl.gstatic.com/gb/images
* http://www.gstatic.com/images/branding/googlemic/2x/googlemic_color_24dp.png
* http://www.gstatic.com/images/branding/googlemic/2x

These requests were failing because the server returned HTTP 405.

I then tried specifying a custom loader of external resources:

new Document("Google Search.html", new HtmlLoadOptions
{
    CustomLoaderOfExternalResources = _ => null
});

But it doesn’t seem to stop the default loading behaviour. The custom loading code is only called after the default loader fails to load the resource.
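
One way to confirm this ordering is to log each callback together with the elapsed time. The sketch below reuses only the CustomLoaderOfExternalResources lambda shape from the snippet above; the Stopwatch, the console output, and the Program class are additions for illustration, and it assumes the Aspose.PDF for .NET package is referenced and that "Google Search.html" is in the working directory.

```csharp
using System;
using System.Diagnostics;
using Aspose.Pdf;

class Program
{
    static void Main()
    {
        var stopwatch = Stopwatch.StartNew();

        var options = new HtmlLoadOptions
        {
            // Log when the custom loader is invoked, then return null
            // exactly as in the original attempt.
            CustomLoaderOfExternalResources = uri =>
            {
                Console.WriteLine(
                    $"{stopwatch.Elapsed.TotalSeconds:F1}s custom loader called for {uri}");
                return null;
            }
        };

        var document = new Document("Google Search.html", options);
        Console.WriteLine($"{stopwatch.Elapsed.TotalSeconds:F1}s document created");
    }
}
```

If the first callback is logged only after a long delay, that would be consistent with the default loader running (and failing) before the custom one is consulted.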

Is there a way to skip loading the external resources, or to cancel the HTML loading process during the long waiting period?

@leon.zhou

At the moment, HtmlLoadOptions has no option to skip loading external resources. We need to investigate further whether this enhancement can be made. For this purpose, we have logged an enhancement ticket, PDFNET-49734, in our issue tracking system. We will look further into the details of this scenario and let you know once the ticket is resolved. Please be patient and give us some time.

We apologize for the inconvenience.
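
Until such an option exists, one possible workaround is to remove absolute URLs from the HTML before handing it to the component, so that there is nothing external left to request. The following is a minimal sketch of that idea, not a confirmed fix: HtmlPreprocessor and StripExternalUrls are hypothetical names, the regular expressions cover only src/href attributes and CSS url(...) values (not, for example, @import rules), and whether Aspose.PDF skips requests for the emptied references has not been verified against this sample.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Aspose.PDF.
static class HtmlPreprocessor
{
    // src="..." or href="..." whose value starts with http://, https:// or //
    private static readonly Regex ExternalAttribute = new Regex(
        @"(?<attr>\b(?:src|href)\s*=\s*)(?<q>[""'])(?:https?:)?//[^""']*\k<q>",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // CSS url(...) whose target starts with http://, https:// or //
    private static readonly Regex ExternalCssUrl = new Regex(
        @"url\(\s*[""']?(?:https?:)?//[^)]*\)",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    public static string StripExternalUrls(string html)
    {
        // Keep the attribute but empty its value, e.g. src="".
        string result = ExternalAttribute.Replace(html, "${attr}${q}${q}");
        // Empty the CSS reference, e.g. url().
        return ExternalCssUrl.Replace(result, "url()");
    }
}

// Possible usage (requires the Aspose.PDF package plus System.IO and System.Text):
//   string html = File.ReadAllText("Google Search.html");
//   string cleaned = HtmlPreprocessor.StripExternalUrls(html);
//   using var stream = new MemoryStream(Encoding.UTF8.GetBytes(cleaned));
//   var document = new Aspose.Pdf.Document(stream, new Aspose.Pdf.HtmlLoadOptions());
```

Note that this also empties ordinary hyperlinks, and that loading from a stream may change how relative resource paths are resolved, so the output should be compared against a normal conversion before relying on it.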

Hi,
I’m facing a similar performance issue with HtmlLoadOptions when converting HTML to PDF. The conversion runs within an Azure Function, and even if I supply the resources through CustomLoaderOfExternalResources, it still tries to load the external resources and times out in the process. We are using Aspose version 22.1. Has this issue been fixed in a later version?

@NDechamma

Regretfully, the issue has not yet been fixed because other issues logged before it are still pending in the queue. However, we will let you know in this forum thread as soon as progress is made towards its resolution. Please give us some time.

We apologize for the inconvenience.