Image is Rendered as Cross | DocumentBuilder.insertHtml | HTML to DOCX Conversion using Java

We are evaluating an upgrade of our applications to Aspose.Words 21.3.
There seems to be an issue in this version with inserting images (as part of HTML) into a Word document.
On a Windows machine the conversion works fine. On our Linux servers, however, the images are not inserted and we see a cross symbol indicating that the image is not available.

Please note: the same code works fine with older versions (19.12, 19.11, 18.6), but we need to upgrade to the latest version for some improvements.

We use aspose-words to convert HTML to Word using the following code:

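public static void insertHtmlAtBookmark(Document wordDoc, String bkmName, String htmlContent) throws Exception {
    DocumentBuilder docBuilder = new DocumentBuilder(wordDoc);
    // Look up the bookmark first so that a missing bookmark is skipped silently
    Bookmark bkm = wordDoc.getRange().getBookmarks().get(bkmName);
    if (bkm != null) {
        docBuilder.moveToBookmark(bkmName);
        // false = do not apply the builder's current formatting to the inserted HTML
        docBuilder.insertHtml(htmlContent, false);
    }
}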

I am attaching the sample HTML and the images for testing purposes.

@jinesh.parikhmca1983

Please make sure that the image in the HTML is accessible to the application. Please open the HTML in a browser on the Linux server to check whether the image is rendered.

If you still face the problem, please ZIP and attach your input Word document, HTML, and problematic output document here for testing. We will investigate the issue and provide you with more information.
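The same accessibility check can also be run from the JVM rather than a browser, which tests the exact user and working directory the application uses. The following is a minimal sketch using only the standard library. The class name `ImageAccessCheck` and its methods are hypothetical helpers, not part of Aspose.Words, and the regular expression is a rough extractor rather than a full HTML parser. It only handles plain file paths, so `http:` or `file:` URLs would be reported as unreadable.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Aspose.Words.
final class ImageAccessCheck {
    // Rough match for <img ... src="..."> or src='...'; not a real HTML parser.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\bsrc\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns every src value found in <img> tags, in document order.
    static List<String> findImageSources(String html) {
        List<String> sources = new ArrayList<>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            sources.add(m.group(2));
        }
        return sources;
    }

    // True if src is a plain file path that this JVM process can read.
    static boolean isReadableFile(String src) {
        try {
            return Files.isReadable(Paths.get(src));
        } catch (InvalidPathException e) {
            return false;
        }
    }

    // Returns the src values that cannot be read as local files.
    static List<String> unreadableImages(String html) {
        List<String> unreadable = new ArrayList<>();
        for (String src : findImageSources(html)) {
            if (!isReadableFile(src)) {
                unreadable.add(src);
            }
        }
        return unreadable;
    }

    // Usage: java ImageAccessCheck /path/to/input.html
    public static void main(String[] args) throws IOException {
        String html = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        for (String src : unreadableImages(html)) {
            System.out.println("Not readable from this JVM: " + src);
        }
    }
}
```

Running it on the Linux server under the same account as the application prints one line per image that the process cannot read.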

@tahir.manzoor
We verified that the image location in the HTML is correct and that the image is accessible. We also tested with a more recent version of aspose-words (21.4) and the issue still persists.

Attached are the HTML (input) and Word (output) files, with the image in the images folder.

The sample code that extracts the HTML div for insertion into Word is copied below:

{
    // htmlDoc and targetHtmlHead are defined elsewhere (not shown)
    Element targetHtml = new Element("html");
    Element targetBody = new Element("body");
    String bkmName = "bkm_BullBear";
    Element replacementDiv = htmlDoc.getElementsByClass("wp-block-research-bull-bear-block").first();

    // Build a minimal html/head/body wrapper around a copy of the target div
    targetBody.appendChild(replacementDiv.clone());
    targetHtml.appendChild(targetHtmlHead);
    targetHtml.appendChild(targetBody);
    // Insert targetHtml at the bookmark location using the code provided earlier
}

htmlToWord_insertion.zip (224.6 KB)

@jinesh.parikhmca1983

In your HTML, the image path is C:\export\rschapps\asposeError\htmlToWord_insertion\images\image0.png. Please check the attached image for detail.

image path.png (54.8 KB)

If you insert this HTML into a Word document on Linux, the image will not be imported into the document because the image path is incorrect.
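A Windows drive-letter path of this kind can be detected before the HTML is passed on. This is a minimal sketch using only the standard library. The class `WindowsPathCheck` and its methods are hypothetical names introduced for illustration, and the check is purely textual: it does not touch the file system or Aspose.Words.

```java
import java.io.File;

// Hypothetical helper for illustration; not part of Aspose.Words.
final class WindowsPathCheck {
    // True for paths that start with a drive letter, e.g. C:\export\... or C:/export/...
    static boolean isWindowsDrivePath(String src) {
        return src != null && src.matches("[A-Za-z]:[\\\\/].*");
    }

    // True if src is a Windows drive path but the current OS does not use '\' as its separator.
    static boolean isForeignOnThisOs(String src) {
        return isWindowsDrivePath(src) && File.separatorChar != '\\';
    }

    public static void main(String[] args) {
        String src = "C:\\export\\rschapps\\asposeError\\htmlToWord_insertion\\images\\image0.png";
        System.out.println("Windows drive path: " + isWindowsDrivePath(src));
        System.out.println("Foreign on this OS: " + isForeignOnThisOs(src));
    }
}
```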

@tahir.manzoor
The image path in the attached HTML looks like that because the sample was run from a Windows server. On Linux, the path is in the expected format, “/export/rschapps/…”.

PLEASE NOTE: We are able to run the same code properly with older versions (19.11, 19.12, 18.6) without any other change, so our code and setup should be fine.

Can you please run the same sample at your end and verify whether the image comes up correctly on a Linux machine?

We’d appreciate a quick resolution, as this matter is business-critical for our firm.

@jinesh.parikhmca1983

We imported the shared HTML into Aspose.Words’ DOM and saved it to DOCX on Linux. We could not reproduce the reported issue. Please check the attached modified HTML and output DOCX.
Docs.zip (280.3 KB)

If you still face the problem, please attach the following resources here for testing:

  • Your input Word document.
  • Your HTML document with the same image path that you are using on Linux.
  • Please create a simple Java application (source code without compilation errors) that helps us to reproduce your problem on our end and attach it here for testing.

As soon as you have these pieces of information ready, we will start investigating your issue and provide you with more information. Thanks for your cooperation.

PS: To attach these resources, please zip and upload them.

Hi @tahir.manzoor
We are able to replicate the same problem with a plain Java class.

I’m attaching a sample with a jar that uses the aforementioned code to insert the input HTML into the input Word file at bookmark “bkm_BB”.

Please try running the same in the two cases below:

  • Using the aspose-words 21.3 dependency: this one fails to insert the image correctly in the docx file (also included).
  • Using the aspose-words 19.11 jar: this one has the same code as above and still works fine for the same input files.

CLI command to run the attached sample:

$ java -cp distribution-test-0.0.1.jar:lib/* -DInputHtmlPath=/tmp/generateWordFromHtml/inputHtml.html -DInputWordPath=/tmp/generateWordFromHtml/inputWord.docx -DOutputWordPath=/tmp/generateWordFromHtml/outputWord.docx -DAsposeLicPath=/tmp/generateWordFromHtml/Aspose.Total.Java.lic com.citi.distribution_test.GenerateWordFromHtml

Expected log in console:
Setting Aspose Word License : /tmp/generateWordFromHtml/Aspose.Total.Java.lic
Html insertion successful at bookmark bkm_BB
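For readers without the attachment: the `-D` flags in the command above become JVM system properties, which a class would typically read with `System.getProperty`. The sketch below shows one way to validate them up front. It is an assumption about how such a launcher could be written, not the code of the attached `GenerateWordFromHtml` class. The class name `LaunchProperties` is hypothetical, and only the four property names are taken from the command line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper for illustration; not the attached GenerateWordFromHtml class.
final class LaunchProperties {
    // Property names taken from the -D flags in the command above.
    static final String[] REQUIRED = {
        "InputHtmlPath", "InputWordPath", "OutputWordPath", "AsposeLicPath"
    };

    // Returns the names of required properties that are absent or blank.
    static List<String> missing(Properties props) {
        List<String> absent = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = props.getProperty(name);
            if (value == null || value.trim().isEmpty()) {
                absent.add(name);
            }
        }
        return absent;
    }

    public static void main(String[] args) {
        List<String> absent = missing(System.getProperties());
        if (!absent.isEmpty()) {
            System.err.println("Missing -D properties: " + absent);
            System.exit(1);
        }
        for (String name : REQUIRED) {
            System.out.println(name + " = " + System.getProperty(name));
        }
    }
}
```

Failing fast on a missing property keeps a mistyped flag from surfacing later as an unrelated file-not-found error.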

generateWordFromHtml_21.3.zip (153.2 KB)

@jinesh.parikhmca1983

Please share the code example that you are using for testing. We will test this case with the latest version of Aspose.Words i.e. 21.5 and old version 19.11.

The code has already been shared in our previous responses. I am attaching the Java class file that can be used; it is part of the jar shared earlier. The previous zip contains the input files that you need.

GenerateWordFromHtml.zip (981 Bytes)

@jinesh.parikhmca1983

Thanks for sharing the code example. Please allow us some time to investigate this issue on Linux. We will get back to you soon.

@jinesh.parikhmca1983

We have logged this problem in our issue tracking system as WORDSNET-22235. You will be notified via this forum thread once this issue is resolved.

We apologize for the inconvenience.

@jinesh.parikhmca1983 The issue WORDSNET-22235 is resolved. The fix will be included in the next release. We are going to start release preparation in a few days. We will let you know once it is published.

Team, is there any update on the release date for 21.6, which will contain the fix for WORDSNET-22235?

@jinesh.parikhmca1983

Hopefully, the next version of Aspose.Words for Java i.e. 21.6 will be available in the next week.

The issues you found earlier (filed as WORDSNET-22235) have been fixed in the Aspose.Words for .NET 21.6 and Aspose.Words for Java 21.6 updates.