java.lang.NoClassDefFoundError: javax/xml/bind/DatatypeConverter when converting html to pdf with wrong img src

Hello,

We are converting HTML content to PDF using Aspose.PDF for Java. We’ve noticed that if the HTML contains an <img> tag whose source cannot be found, the conversion fails and the following exception is thrown:

java.lang.NoClassDefFoundError: javax/xml/bind/DatatypeConverter
	at com.aspose.pdf.internal.html.rendering.l0h.lI(Unknown Source)
	at com.aspose.pdf.internal.html.rendering.l0h.lI(Unknown Source)
	at com.aspose.pdf.internal.l43f.l0l.lf(Unknown Source)
	at com.aspose.pdf.internal.l43u.ly.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l43h.lf.lI(Unknown Source)
	at com.aspose.pdf.internal.l42u.lj.lI(Unknown Source)
	at com.aspose.pdf.internal.l42j.lI.lI(Unknown Source)
	at com.aspose.pdf.internal.l51p.lI.lI(Unknown Source)
	at com.aspose.pdf.internal.l43v.lt.lI(Unknown Source)
	at com.aspose.pdf.internal.l43v.lf.lj(Unknown Source)
	at com.aspose.pdf.internal.html.collections.lj.lj(Unknown Source)
	at com.aspose.pdf.internal.html.collections.lj.hasNext(Unknown Source)

This can be reproduced with the latest 23.12 Aspose version, and we are using Java 17.

Sample java code:

	HtmlLoadOptions options = new HtmlLoadOptions();
	Document htmlDocument = new Document(RESOURCE_DIR + "img_src.html", options);
	htmlDocument.save("img_src.pdf");

Sample html source:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Title</title>
</head>
<body>
<div>
    <img src="no_image.png"/>
</div>
</body>
</html>

We would expect either a successful conversion without an image in the output pdf, or at least a ‘normal’ exception that we could catch and inform a user about their wrong input. Our users can add their own content and we cannot make sure their content is valid.

Why is there a dependency on DatatypeConverter when the input is incorrect? Is there a workaround other than adding a jar containing the DatatypeConverter class to the classpath, which we are not willing to do?
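On the point about wanting a 'normal' exception to catch: NoClassDefFoundError is a LinkageError (an Error, not an Exception), so a plain catch (Exception e) will not see it. Below is a minimal sketch of a stopgap, assuming it is acceptable to catch LinkageError around the conversion call. ConversionGuard and ConversionFailedException are hypothetical names for illustration and are not part of Aspose.PDF.

```java
import java.util.concurrent.Callable;

/** Hypothetical checked exception used to report a failed conversion to the caller. */
class ConversionFailedException extends Exception {
    ConversionFailedException(String message, Throwable cause) {
        super(message, cause);
    }
}

/** Hypothetical helper (not an Aspose class) that turns linkage errors into a checked exception. */
final class ConversionGuard {
    private ConversionGuard() {}

    static <T> T run(Callable<T> conversion) throws ConversionFailedException {
        try {
            return conversion.call();
        } catch (LinkageError e) {
            // NoClassDefFoundError extends LinkageError, so it is handled here.
            throw new ConversionFailedException(
                    "Conversion failed, class could not be loaded: " + e.getMessage(), e);
        } catch (Exception e) {
            // Ordinary exceptions are wrapped the same way so callers handle one type.
            throw new ConversionFailedException("Conversion failed: " + e.getMessage(), e);
        }
    }
}
```

With the sample code from this post, the load and save calls would go inside the lambda, for example ConversionGuard.run(() -> { new Document(path, options).save("img_src.pdf"); return null; }). This only makes the failure reportable; it does not produce a PDF for the affected input.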

Thank you for your answers.

Arjana

@arjana

We were not able to replicate the exception in our environment using version 23.12 of the API. Attached is the generated output.
html2PDF.pdf (2.5 KB)

Can you please share which JDK version you are using, and the steps to replicate the issue? You can share a sample console application in .zip format that we can use to reproduce the issue and address it accordingly.

JDK on Windows:

java version "17.0.7" 2023-04-18 LTS
Java(TM) SE Runtime Environment (build 17.0.7+8-LTS-224)
Java HotSpot(TM) 64-Bit Server VM (build 17.0.7+8-LTS-224, mixed mode, sharing)

JDK on Linux (RedHat 8.9):

openjdk version "17.0.1" 2021-10-19
OpenJDK Runtime Environment Temurin-17.0.1+12 (build 17.0.1+12)
OpenJDK 64-Bit Server VM Temurin-17.0.1+12 (build 17.0.1+12, mixed mode, sharing)

Here is a maven project with sources:
NotFoundImage.zip (5.4 KB)

And here is a recorded screen of the application in action:
Screen_recording.zip (6.2 MB)

Or one can take the aspose 23.12 jar, unzip the compiled application to the same folder, and run the application:
aspose-images-1.0-SNAPSHOT.zip (5.2 KB)
java -cp "aspose-pdf-23.12.jar;aspose-images-1.0-SNAPSHOT.jar" org.aspose.Main

No specific actions are needed to replicate the case - just an html with a link to a non-existing image, it does not matter if a link is relative, absolute, or with a file: protocol.
In my sample project, there is another html img_src2.html which has a correct absolute link to an image, and it is converted successfully.
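Since the failure depends only on whether the referenced image exists, one possible way to give users feedback before conversion is to scan the HTML for local image sources that do not resolve to a file. The following is a rough sketch under stated assumptions: it uses a regular expression rather than a real HTML parser, it skips http:, https: and data: sources, it does not decode URL-encoded paths, and ImageSourceCheck and findMissingLocalImages are hypothetical names.

```java
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical pre-check; an approximation, not a full HTML parser. */
final class ImageSourceCheck {
    // Matches the quoted src attribute of an <img> tag.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\ssrc\\s*=\\s*([\"'])(.*?)\\1", Pattern.CASE_INSENSITIVE);

    private ImageSourceCheck() {}

    /** Returns the src values that point to local files which do not exist. */
    static List<String> findMissingLocalImages(String html, Path baseDir) {
        List<String> missing = new ArrayList<>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            String src = m.group(2).trim();
            String lower = src.toLowerCase();
            if (lower.startsWith("http:") || lower.startsWith("https:") || lower.startsWith("data:")) {
                continue; // remote and inline images are not checked in this sketch
            }
            Path file;
            try {
                // Absolute paths stay as they are; relative ones resolve against baseDir.
                file = lower.startsWith("file:") ? Paths.get(URI.create(src)) : baseDir.resolve(src);
            } catch (RuntimeException e) {
                missing.add(src); // unparsable path or URI is reported as missing
                continue;
            }
            if (!Files.isRegularFile(file)) {
                missing.add(src);
            }
        }
        return missing;
    }
}
```

For the sample img_src.html above, this would report no_image.png when called with the directory that contains the HTML file as baseDir.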

@arjana

We were able to replicate the issue in our environment with your sample project and JDK 17. Earlier we tested with JDK 1.8.
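The difference between the two JDKs is consistent with the javax.xml.bind package having shipped with Java SE 8 and having been removed from the JDK in Java 11 (JEP 320). A small check like the one below can show whether the class is visible on a given runtime and classpath; JaxbAvailability and isClassAvailable are hypothetical names used only for this illustration.

```java
/** Hypothetical diagnostic; reports whether a class can be loaded on this runtime. */
final class JaxbAvailability {
    private JaxbAvailability() {}

    static boolean isClassAvailable(String className) {
        try {
            // initialize=false: only look the class up, do not run its static initializers.
            Class.forName(className, false, JaxbAvailability.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("javax.xml.bind.DatatypeConverter available: "
                + isClassAvailable("javax.xml.bind.DatatypeConverter"));
    }
}
```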

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFJAVA-43495

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.