Hello,
We recently noticed Bookmark DATA_LOSS warnings when converting HTML documents that contain comments into DOCX files.
Despite the warnings, the output DOCX still looks fine (comments are present and visible in Word).
We would like to confirm:
- Whether this is expected behavior or not.
- Whether the comment markup syntax has changed (e.g., -aw-comment-* attributes).
- How we should adjust our HTML to avoid these warnings.
Additional note: when we exported a Word document containing a comment to HTML using Aspose.Words, the output used the same markup syntax, and no warning occurred.
We weren’t able to find any documentation on the HTML markup format used for comments. Is it documented somewhere?
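To rule out a simple mismatch on our side, we put together the stdlib-only Java sketch below. It cross-checks that every comment has a range start anchor (name="_cmntrefN"), a range end marker (-aw-comment-end:_cmntrefN), and a body element (id="_cmntN"). The class name CommentMarkupCheck and the pairing rule between _cmntrefN and _cmntN are our own assumptions, inferred from the markup in this post and not from any Aspose documentation, so it may not reflect how the importer actually matches these elements.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Aspose.Words.
// The patterns are inferred from our sample markup only (assumption).
final class CommentMarkupCheck {
    private static final Pattern RANGE_START =
            Pattern.compile("name=\"_cmntref(\\d+)\"");
    private static final Pattern RANGE_END =
            Pattern.compile("-aw-comment-end:\\s*_cmntref(\\d+)");
    private static final Pattern COMMENT_BODY =
            Pattern.compile("id=\"_cmnt(\\d+)\"");

    // Collects the numeric suffix of every match of the given pattern.
    private static Set<String> collect(Pattern pattern, String html) {
        Set<String> ids = new TreeSet<>();
        Matcher matcher = pattern.matcher(html);
        while (matcher.find()) {
            ids.add(matcher.group(1));
        }
        return ids;
    }

    // Returns one message per missing piece; an empty list means every
    // comment number has a start anchor, an end marker, and a body.
    static List<String> findProblems(String html) {
        Set<String> starts = collect(RANGE_START, html);
        Set<String> ends = collect(RANGE_END, html);
        Set<String> bodies = collect(COMMENT_BODY, html);

        // Union of all comment numbers seen anywhere (sorted as strings).
        Set<String> all = new TreeSet<>(starts);
        all.addAll(ends);
        all.addAll(bodies);

        List<String> problems = new ArrayList<>();
        for (String id : all) {
            if (!starts.contains(id)) {
                problems.add("comment " + id + ": no range start anchor");
            }
            if (!ends.contains(id)) {
                problems.add("comment " + id + ": no -aw-comment-end marker");
            }
            if (!bodies.contains(id)) {
                problems.add("comment " + id + ": no comment body element");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Reduced version of the markup from our reproduction.
        String html = "<p>te<a name=\"_cmntref1\"></a>test"
                + "<span style=\"-aw-comment-end:_cmntref1\"> </span></p>"
                + "<div id=\"_cmnt1\"><p>test comment</p></div>";
        System.out.println("problems: " + findProblems(html));
    }
}
```

By this check, the markup in our reproduction is internally consistent (each of the three pieces is present for comment 1), which is why we suspect the warnings are not caused by a missing or misnamed element on our side.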
Reproduction Code (Java):
@Test
void import_doc_from_html() throws Exception {
    final String html = """
            <html style="font-family: helvetica; font-size: 11pt;">
            <body>
            <p style="margin: 0; word-break: break-word;">te
            <a data-cl-comment name="_cmntref1"></a>test
            <span data-cl-comment style="-aw-comment-end:_cmntref1"> </span>
            <a data-cl-comment href="#_cmnt1">[1]</a>
            </p>
            <p style="margin: 0; word-break: break-word;">teetete</p>
            <div id="_cmnt1"
            style="-aw-comment-author:'Concord Tech';
            -aw-comment-datetime:'2025-10-24T09:28:38.171';
            -aw-comment-initial:'CT'">
            <p>
            <a data-cl-comment href="#_cmntref1">[1]</a>
            <span>test comment</span>
            </p>
            </div>
            </body>
            </html>
            """;

    final Document doc = new Document();
    final DocumentBuilder builder = new DocumentBuilder(doc);
    builder.insertHtml(html);

    final var warningCollector = new WarningInfoCollection();
    doc.setWarningCallback(warningCollector);
    doc.save("html/Aspose_ImportHTMLWithComments.docx");

    for (final var warning : warningCollector) {
        System.err.println("Warning: " + WarningType.getName(warning.getWarningType())
                + " - " + warning.getDescription());
    }
}
Observed output:
Warning: DATA_LOSS - Bookmark '_cmnt1' without corresponding BookmarkStart was removed.
Warning: DATA_LOSS - Bookmark '_cmnt1' without corresponding BookmarkEnd was removed.
Environment:
Aspose.Words for Java version: 25.9
Thanks in advance!