Bullet Symbol is Lost after Inserting HTML into DOCX using Java

Hi,

I am facing an issue when using Aspose.Words for Java to insert HTML into a Word document at a bookmark. If the last item in a bullet list is empty, its bullet goes missing in the output. When I insert the same HTML file directly through the Insert tab in MS Word, the bullet is kept. I have attached the input Word file, the HTML file used for the insert, and the output Word file in the zip attached to this topic.

The code used to do this is as follows:

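import java.io.File;
import org.jsoup.Jsoup;
import com.aspose.words.Document;
import com.aspose.words.DocumentBuilder;

String inputWordFilePath = "path/to/input/word/Test_Template.docx";
String inputHtmlFilePath = "path/to/input/html/Test.html";
// Parse the HTML file with jsoup and serialize it back to a string.
org.jsoup.nodes.Document htmlDocument = Jsoup.parse(new File(inputHtmlFilePath), "UTF-8");
String html = htmlDocument.outerHtml();
// Open the template and insert the HTML at the "bkm" bookmark.
Document document = new Document(inputWordFilePath);
DocumentBuilder documentBuilder = new DocumentBuilder(document);
documentBuilder.moveToBookmark("bkm");
documentBuilder.insertHtml(html, false); // false = do not apply the builder's formatting
document.save("path/to/output/word/testOut.docx");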
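
To tell in advance whether a given HTML input falls into the case described above (an empty last bullet), a small check can be run on the HTML string before it is inserted. The following is only a minimal sketch using the standard library: the class name EmptyLastBulletCheck and the method lastListItemIsEmpty are hypothetical and not part of Aspose.Words or jsoup. It assumes list items are not nested, and it treats an item as empty when it contains only tags and whitespace.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for diagnosing the input; not part of Aspose.Words or jsoup.
class EmptyLastBulletCheck {

    // Matches each <li>...</li> element. Assumption: list items do not contain nested lists.
    private static final Pattern LIST_ITEM =
            Pattern.compile("<li\\b[^>]*>(.*?)</li\\s*>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns true when the final <li> in the HTML holds no text once tags and whitespace are removed.
    static boolean lastListItemIsEmpty(String html) {
        Matcher matcher = LIST_ITEM.matcher(html);
        String lastBody = null;
        while (matcher.find()) {
            lastBody = matcher.group(1); // keep overwriting so the last match wins
        }
        if (lastBody == null) {
            return false; // the HTML has no list items at all
        }
        String text = lastBody.replaceAll("<[^>]*>", "").trim();
        return text.isEmpty();
    }

    public static void main(String[] args) {
        String html = "<ul><li>First</li><li>Second</li><li></li></ul>";
        System.out.println(lastListItemIsEmpty(html));
    }
}
```

A regular expression is used here only to keep the sketch free of dependencies; since the code above already parses the file with jsoup, the same check could be done on the parsed document instead.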

The Aspose.Words for Java version used is 21.6.0:

<dependency>
	<groupId>com.aspose</groupId>
	<artifactId>aspose-words</artifactId>
	<version>21.6.0</version>
	<classifier>jdk17</classifier>
</dependency>

Zip attachment: issue.zip (25.1 KB)

Please advise.

@jinesh.parikhmca1983

We have logged this problem in our issue tracking system as WORDSNET-22574. You will be notified via this forum thread once this issue is resolved.

We apologize for the inconvenience.

@jinesh.parikhmca1983

Please use the DocumentBuilder.insertHtml method with HtmlInsertOptions.REMOVE_LAST_EMPTY_PARAGRAPH, as shown below, to get the desired output.

Document document = new Document();
DocumentBuilder documentBuilder = new DocumentBuilder(document);
documentBuilder.insertHtml("Your HTML", HtmlInsertOptions.REMOVE_LAST_EMPTY_PARAGRAPH);

The issues you have found earlier (filed as WORDSNET-22574) have been fixed in this Aspose.Words for .NET 21.9 update.