HTML to PDF, with Images - Performance and Proxy / TrustStore

I’m observing terrible conversion times when attempting to convert even the simplest HTML document to PDF (AsposePDF.Java 20.5).

I have narrowed down a number of issues:

  • When attempting to download images from external resources, AsposePDF.Java does not use the truststore configured via javax.net.ssl.trustStore. With -Djavax.net.debug=all enabled, I can see that the default JSSE truststore is being loaded, and not the one specified by the system property. If you are using SLF4J, you also need the jul-to-slf4j bridge, otherwise the messages about no trusted path are lost.
  • If prior to the conversion, I manually use new URL("https://server.local/img.png").openConnection() then the configured truststore is used, so AsposePDF is not respecting the configured truststore.
  • AsposePDF.Java also explicitly requires the http.proxyHost and http.proxyPort system properties to be set when running on a corporate network; if they are not set, I see a hang of nearly 90 seconds during conversion. This is not related to the poor performance of the first load, as the second (and each subsequent) conversion also takes around 90 seconds, which I presume is a URL connection timeout. You can set http.proxyHost or http.proxyPort to anything and it reduces the second and later conversions to about 1 second.
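
To rule out a misconfigured property on my side, a check along these lines can confirm that the JVM sees the truststore setting and that the file is readable. This is only a sketch using plain JDK classes; the class name TrustStoreCheck and its methods are made up for illustration, and it says nothing about which truststore AsposePDF.Java loads internally.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.KeyStore;

// Hypothetical helper, not part of any library: inspects the truststore
// settings visible to the current JVM.
final class TrustStoreCheck {

    /** Returns the path named by javax.net.ssl.trustStore, or null if the property is unset. */
    static String configuredPath() {
        return System.getProperty("javax.net.ssl.trustStore");
    }

    /** Counts the entries in a truststore file; checked exceptions are wrapped to keep callers simple. */
    static int countEntries(String path, char[] password) {
        try (InputStream in = Files.newInputStream(Paths.get(path))) {
            // Fall back to the JVM default keystore type when no explicit type is configured.
            String type = System.getProperty("javax.net.ssl.trustStoreType", KeyStore.getDefaultType());
            KeyStore store = KeyStore.getInstance(type);
            store.load(in, password);
            return store.size();
        } catch (Exception e) {
            throw new IllegalStateException("Could not read truststore " + path, e);
        }
    }

    public static void main(String[] args) {
        String path = configuredPath();
        if (path == null) {
            System.out.println("javax.net.ssl.trustStore is not set; the JRE default truststore applies");
            return;
        }
        String pw = System.getProperty("javax.net.ssl.trustStorePassword");
        int entries = countEntries(path, pw == null ? null : pw.toCharArray());
        System.out.println(path + " contains " + entries + " entries");
    }
}
```

Running this with the same -D options as the conversion shows whether the property reaches the JVM at all, which separates a launch-configuration problem from the library ignoring the setting.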

So is there any specific guidance on using a custom truststore for external resource downloads and on configuring a proxy? And why does the following (when pulling resources from a server not trusted by the truststore) take so long to convert if http.proxyHost is not set?

<html>
<body>
<h1>Title</h1>

<img src="https://server.local/img.png"
     width="100px"
     height="100px">

</body>
</html>

@chriswhite199

Would you kindly share the complete sample code snippet that you are using? We will test the scenario in our environment and address it accordingly.

Code to reproduce - you’ll need to be on a machine that can only access the internet via a proxy:

import java.util.Date;
import com.aspose.pdf.Document;
import com.aspose.pdf.HtmlLoadOptions;

// Runs inside a test method; /test.html is the sample HTML on the classpath.
System.out.println("1st load:  " + new Date());
new Document(AbstractHtmlToPdfInterceptorTest.class.getResourceAsStream("/test.html"), new HtmlLoadOptions());

System.out.println("2nd load:  " + new Date());
new Document(AbstractHtmlToPdfInterceptorTest.class.getResourceAsStream("/test.html"), new HtmlLoadOptions());

System.out.println("Completed: " + new Date());

Output:

1st load:  Tue May 26 19:47:45 PDT 2020
2nd load:  Tue May 26 19:49:43 PDT 2020
Completed: Tue May 26 19:51:24 PDT 2020
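
Going by the timestamps above, the first load took about 118 seconds and the second about 101 seconds. For more precise numbers than Date printouts give, each load can be wrapped in a small timer. The sketch below is an assumption of how that might look; LoadTimer is a made-up name and the sleep only stands in for the actual Document load.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper for timing a single conversion step.
final class LoadTimer {

    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(Runnable task) {
        long start = System.nanoTime(); // monotonic clock, unaffected by system clock changes
        task.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Placeholder workload; replace the body with the Document load being measured.
        Runnable load = () -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        System.out.println("1st load took " + timeMillis(load) + " ms");
        System.out.println("2nd load took " + timeMillis(load) + " ms");
    }
}
```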

Sample html:

<html>
<body>
<h1>Title</h1>

<img src="https://server.local/img.png"
     width="100px"
     height="100px">

</body>
</html>

If I run with the system property http.proxyHost set to localhost (passed at JRE launch as a -Dxxx=yyy option, not set dynamically in the class itself), the output shows reduced execution times:

1st load:  Tue May 26 19:52:11 PDT 2020
2nd load:  Tue May 26 19:52:29 PDT 2020
Completed: Tue May 26 19:52:29 PDT 2020
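
To see what the JVM itself would do with these properties, the default ProxySelector can be queried for the image URL. In standard JDK networking, http.proxyHost and http.proxyPort apply to http:// URLs, while https:// URLs are governed by https.proxyHost and https.proxyPort. The sketch below only reports what the JDK selects; ProxyCheck is a made-up name, and whether AsposePDF.Java goes through the same selector is an assumption I have not verified.

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

// Hypothetical helper: shows which proxies the JDK's default selector picks for a URL.
final class ProxyCheck {

    /** Returns the proxies the default selector would try for the URL; a DIRECT entry means no proxy. */
    static List<Proxy> proxiesFor(String url) {
        return ProxySelector.getDefault().select(URI.create(url));
    }

    public static void main(String[] args) {
        String[] urls = {"http://server.local/img.png", "https://server.local/img.png"};
        for (String url : urls) {
            System.out.println(url + " -> " + proxiesFor(url));
        }
        // Print the raw properties too, so the launch options can be compared with the result.
        for (String key : new String[] {"http.proxyHost", "http.proxyPort", "https.proxyHost", "https.proxyPort"}) {
            System.out.println(key + "=" + System.getProperty(key));
        }
    }
}
```

Comparing the output with and without -Dhttp.proxyHost=localhost should show whether the https image URL is affected by that property at the JDK level at all.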

@chriswhite199

Thanks for providing the requested details.

We have logged an investigation ticket as PDFJAVA-39444 in our issue tracking system for this scenario. We will look into the details of the issue and keep you posted on the status of the ticket's resolution. Please be patient and allow us some time.

We are sorry for the inconvenience.

@asad.ali - Any update on this?

@chriswhite199

We are afraid that the earlier logged ticket is not yet resolved. We will investigate and resolve it on a first come, first served basis and let you know as soon as we have some definite updates in this regard. Please give us some time.

We apologize for the inconvenience.