Issue with Comments Spanning Multiple Paragraphs in Aspose.Words

Hello!

We are experiencing an issue when working with comment ranges using Aspose.Words for Java. Specifically, we have a scenario where comment markers (CommentRangeStart and CommentRangeEnd) are placed across multiple paragraphs.

The structure after parsing our HTML input looks like this:

HTML Input:

<ul>
  <li>{{data-comment-start="019777a3-5104-71a8-985f-04e99e557fcc"}}11111</li>
  <li>2222{{data-comment-end="019777a3-5104-71a8-985f-04e99e557fcc"}}</li>
  <li>3333</li>
</ul>

Paragraph 1 run 1:
{{data-comment-start="019777a3-5104-71a8-985f-04e99e557fcc"}}11111

Paragraph 2 run 1:
2222{{data-comment-end="019777a3-5104-71a8-985f-04e99e557fcc"}}

Paragraph 3 run 1:
3333

The issue is that when the comment range starts in one paragraph and ends in another, the Aspose comment does not highlight the intended range properly. It appears the CommentRangeEnd marker is not recognized correctly, resulting in the comment range ending prematurely or being improperly associated.

Could you please provide guidance on how to correctly implement comment ranges that span across multiple paragraphs?
Is there an official recommendation or best practice to ensure CommentRangeStart and CommentRangeEnd markers correctly highlight content across paragraph boundaries?

Thank you very much for your assistance.

Below is a snippet of code that works well, even with overlapping fragments, but only when the whole comment range is within a single paragraph.

    public static void insertAsposeCommentsFromMarkers(
            DocumentBuilder builder,
            Map<UUID, Discussion> discussionMap
    ) {
        com.aspose.words.Document asposeDoc = builder.getDocument();

        Pattern markerPattern = Pattern.compile(
                "\\{\\{data-comment-(start|end)=\"([^\"]+)\"\\}\\}"
        );

        for (com.aspose.words.Node node : asposeDoc.getChildNodes(NodeType.PARAGRAPH, true).toArray()) {
            Paragraph paragraph = (Paragraph) node;

            Map<UUID, Comment> commentMap = new HashMap<>();
            Map<UUID, CommentRangeStart> openRanges = new HashMap<>();

            List<Run> runs = new ArrayList<>(Arrays.asList(paragraph.getRuns().toArray()));
            if (runs.isEmpty()) continue;

            for (Run run : new ArrayList<>(runs)) {
                String runText = run.getText();

                if (runText == null) continue;

                Matcher matcher = markerPattern.matcher(runText);
                List<Marker> markers = new ArrayList<>();
                while (matcher.find()) {
                    String type = matcher.group(1);
                    String uuidStr = matcher.group(2);
                    try {
                        UUID discussionUuid = UUID.fromString(uuidStr);
                        markers.add(new Marker(type, discussionUuid, matcher.start(), matcher.end()));
                    } catch (IllegalArgumentException e) {
                        log.error("Invalid UUID: {}", uuidStr);
                    }
                }

                int lastPos = 0;

                for (Marker marker : markers) {
                    if (marker.start > lastPos) {
                        Run textRun = (Run) run.deepClone(false);
                        textRun.setText(runText.substring(lastPos, marker.start));
                        paragraph.insertBefore(textRun, run);
                    }

                    if ("start".equals(marker.type)) {
                        Comment comment = commentMap
                                .computeIfAbsent(marker.uuid, u -> createAsposeCommentByParagraph(paragraph, asposeDoc, discussionMap, u));
                        if (comment != null) {
                            CommentRangeStart start = new CommentRangeStart(asposeDoc, comment.getId());
                            paragraph.insertBefore(start, run);
                            openRanges.put(marker.uuid, start);
                        } else {
                            log.error("Aspose comment for discussion UUID: {} is null on start", marker.uuid);
                        }
                    } else {
                        if (openRanges.containsKey(marker.uuid)) {
                            Comment comment = commentMap.get(marker.uuid);
                            if (comment != null) {
                                CommentRangeEnd endMarker = new CommentRangeEnd(asposeDoc, comment.getId());
                                paragraph.insertBefore(endMarker, run);
                                openRanges.remove(marker.uuid);
                                log.debug("Added CommentRangeEnd for discussion UUID: {}", marker.uuid);
                            } else {
                                log.error("Aspose comment for discussion UUID: {} is null on end", marker.uuid);
                            }
                        } else {
                            log.error("No comment found for CommentRangeEnd with discussion UUID: {}", marker.uuid);
                        }
                    }

                    lastPos = marker.end;
                }

                if (lastPos < runText.length()) {
                    Run textRun = (Run) run.deepClone(false);
                    textRun.setText(runText.substring(lastPos));
                    paragraph.insertBefore(textRun, run);
                }

                run.remove();
            }
        }
    }

    private static Comment createAsposeCommentByParagraph(
            com.aspose.words.Paragraph paragraph,
            com.aspose.words.Document doc,
            Map<UUID, Discussion> discussionMap,
            UUID uuid
    ) {
        Discussion disc = discussionMap.get(uuid);
        if (disc == null || disc.getComments().isEmpty()) {
            return null;
        }

        ru.vtb.dc.agreement.model.discussion.Comment first = disc.getComments().get(0);
        Comment comment = new Comment(doc, first.getCreatedBy(), first.getCreatedBy(), fromLocalDateTime(first.getCreatedAt()));
        comment.setText(first.getBody());

        paragraph.appendChild(comment);

        disc.getComments().stream().skip(1).forEach(reply -> {
            try {
                comment.addReply(reply.getCreatedBy(),
                        reply.getCreatedBy(),
                        fromLocalDateTime(reply.getCreatedAt()),
                        reply.getBody());
            } catch (Exception e) {
                log.error("Error while creating comment replies for discussion UUID: {}", uuid);
            }
        });

        return comment;
    }

    // Must be static because it is called from the static methods above.
    private static Date fromLocalDateTime(LocalDateTime localDateTime) {
        return Date.from(localDateTime.atZone(ZoneId.systemDefault()).toInstant());
    }

    private static class Marker {
        String type;
        UUID uuid;
        int start;
        int end;

        Marker(String type, UUID uuid, int start, int end) {
            this.type = type;
            this.uuid = uuid;
            this.start = start;
            this.end = end;
        }
    }
}

@Liubei

To address issues with comment ranges that span multiple paragraphs in Aspose.Words for Java, it’s essential to ensure that the CommentRangeStart and CommentRangeEnd markers are correctly positioned and recognized.

Best Practices for Implementing Comment Ranges Across Paragraphs

  1. Creating Comment Ranges: When you create a comment, you must associate it with both a CommentRangeStart and a CommentRangeEnd. These should be placed around the text that the comment refers to. For example:

    Comment comment = new Comment(doc, "Author", "Initials", new Date());
    comment.getParagraphs().add(new Paragraph(doc));
    comment.getFirstParagraph().getRuns().add(new Run(doc, "Comment text."));
    
    CommentRangeStart start = new CommentRangeStart(doc, comment.getId());
    CommentRangeEnd end = new CommentRangeEnd(doc, comment.getId());
    
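    // run1 and run3 are existing runs and may belong to different paragraphs.
    // The start marker goes before the first commented run, the end marker after the last one.
    run1.getParentNode().insertBefore(start, run1);
    run3.getParentNode().insertAfter(end, run3);
    // The Comment node itself must also be attached to a paragraph.
    run3.getParentNode().insertAfter(comment, end);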
    
  2. Correct Placement: Ensure that CommentRangeStart is placed immediately before the first run of the commented text and CommentRangeEnd just after the last run of the commented text, even when that last run is in a later paragraph. This ensures that the comment highlights exactly the intended text.

  3. Handling Multiple Paragraphs: If your comment spans multiple paragraphs, make sure to insert the CommentRangeStart in the first paragraph and the CommentRangeEnd in the last paragraph where the comment should apply. This allows Aspose.Words to recognize the entire range across the paragraphs.

  4. Node Structure: Remember that CommentRangeStart and CommentRangeEnd are inline-level nodes and must be children of a paragraph node. Ensure that they are not misplaced outside their intended context.

  5. Testing: After implementing the comments, it’s crucial to test the document thoroughly. Check if the comments are rendered correctly in your output format (e.g., DOCX, PDF) to verify that the highlighting works as expected.

By following these guidelines, you should be able to implement comments in Aspose.Words that span multiple paragraphs without issues. If problems persist, consider reviewing your code for any placement issues of the comment markers.
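
To illustrate point 3 without depending on the Aspose API, here is a minimal, library-free sketch. In the code from the question, `commentMap` and `openRanges` are created anew for every paragraph, so an end marker in a later paragraph cannot find its start. The sketch below keeps that state outside the paragraph loop. The names `CrossParagraphMarkers`, `Span` and `findSpans` are hypothetical and only serve the illustration; paragraphs are modelled as plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CrossParagraphMarkers {
    // Same marker syntax as in the question.
    static final Pattern MARKER = Pattern.compile(
            "\\{\\{data-comment-(start|end)=\"([^\"]+)\"\\}\\}");

    // Hypothetical helper type: the paragraph indexes where a range starts and ends.
    static class Span {
        final String id;
        final int startParagraph;
        final int endParagraph;

        Span(String id, int startParagraph, int endParagraph) {
            this.id = id;
            this.startParagraph = startParagraph;
            this.endParagraph = endParagraph;
        }
    }

    static List<Span> findSpans(List<String> paragraphs) {
        // Declared once, outside the paragraph loop, so a range opened in one
        // paragraph is still known when its end marker shows up later.
        Map<String, Integer> open = new HashMap<>();
        List<Span> spans = new ArrayList<>();
        for (int i = 0; i < paragraphs.size(); i++) {
            Matcher m = MARKER.matcher(paragraphs.get(i));
            while (m.find()) {
                String id = m.group(2);
                if ("start".equals(m.group(1))) {
                    open.put(id, i);
                } else {
                    Integer start = open.remove(id);
                    // An end marker without a matching start is ignored here.
                    if (start != null) {
                        spans.add(new Span(id, start, i));
                    }
                }
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        String id = "019777a3-5104-71a8-985f-04e99e557fcc";
        List<String> paragraphs = Arrays.asList(
                "{{data-comment-start=\"" + id + "\"}}11111",
                "2222{{data-comment-end=\"" + id + "\"}}",
                "3333");
        for (Span s : findSpans(paragraphs)) {
            // prints "019777a3-5104-71a8-985f-04e99e557fcc: paragraphs 0..1"
            System.out.println(s.id + ": paragraphs " + s.startParagraph + ".." + s.endParagraph);
        }
    }
}
```

In an Aspose-based implementation the same idea would presumably mean moving the two maps above the loop over paragraphs, so that the CommentRangeEnd can be created in whichever paragraph the end marker is found.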

For further reference, you can check out the official documentation on Working with Comments in Aspose.Words for Java which includes examples and detailed explanations.

Example Code Snippet

Here’s a simplified example showing how to set up comments that span multiple paragraphs:

import com.aspose.words.*;
import java.util.Date;

Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);

// Plain text, then the run where the comment range starts
builder.writeln("Some text here.");
Run run1 = new Run(doc, "Some highlighted text.");
builder.insertNode(run1);

// End the paragraph, then add the run where the comment range ends
builder.writeln("Following text.");
Run run3 = new Run(doc, "Continued highlighting.");
builder.insertNode(run3);

// Create comment
Comment comment = new Comment(doc, "Author", "A", new Date());
comment.getParagraphs().add(new Paragraph(doc));
comment.getFirstParagraph().getRuns().add(new Run(doc, "This is a comment."));

// Create comment range markers
CommentRangeStart start = new CommentRangeStart(doc, comment.getId());
CommentRangeEnd end = new CommentRangeEnd(doc, comment.getId());

// Insert markers: start before the first commented run, end after the last one
run1.getParentNode().insertBefore(start, run1);
run3.getParentNode().insertAfter(end, run3);
// Attach the Comment node itself right after the end marker
run3.getParentNode().insertAfter(comment, end);

// Save the document
doc.save("output.docx");

In this example, the comment correctly spans the text across two paragraphs, allowing for proper highlighting when viewed in a document viewer.

@Liubei Aspose.Words can preserve comment ranges using special tags in HTML like this:

<html>
	<head>
		<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
		<meta http-equiv="Content-Style-Type" content="text/css" />
		<meta name="generator" content="Aspose.Words for .NET 25.6.0" />
		<title>
		</title>
	</head>
	<body style="line-height:116%; font-family:Calibri; font-size:12pt">
		<div>
			<p style="margin-top:0pt; margin-bottom:8pt">
				<span>This </span><a name="_cmntref1"><span>is the first paragraph</span></a>
			</p>
			<p style="margin-top:0pt; margin-bottom:8pt">
				<span>This is the</span><span style="-aw-comment-end:_cmntref1"></span><a href="#_cmnt1" style="text-decoration:none"><span style="font-size:8pt">[AN1]</span></a><span> second one</span>
			</p>
		</div>
		<hr style="width:33%; height:1px; text-align:left" />
		<div id="_cmnt1" style="-aw-comment-author:'Alex Noskov'; -aw-comment-datetime:'2025-06-16T16:49:00'; -aw-comment-initial:'AN'">
			<p style="margin-top:0pt; margin-bottom:8pt; line-height:normal; font-size:10pt">
				<a href="#_cmntref1" style="text-decoration:none"><span style="font-size:8pt">[AN1]</span></a><span>commet</span>
			</p>
		</div>
	</body>
</html>

If you need to use your custom tags, you can use the following code:

Document doc = new Document("C:\\Temp\\in.html");
DocumentBuilder builder = new DocumentBuilder(doc);

Pattern markerPattern = Pattern.compile("\\{\\{data-comment-(start|end)=\"([^\"]+)\"\\}\\}");
// Replace every marker with itself. This is meant to put each marker into a Run of its own.
FindReplaceOptions opt = new FindReplaceOptions();
opt.setUseSubstitutions(true);
doc.getRange().replace(markerPattern, "$0", opt);

// Comments are tracked for the whole document, not per paragraph,
// so a range can end in a later paragraph than the one it started in.
HashMap<String, Comment> commentMap = new HashMap<String, Comment>();
for (Run r : (Iterable<Run>)doc.getChildNodes(NodeType.RUN, true))
{
    Matcher matcher = markerPattern.matcher(r.getText());
    if (matcher.find())
    {
        String guid = matcher.group(2);
        if (r.getText().contains("data-comment-start"))
        {
            // Create comment and comment range start.
            Comment c = new Comment(doc);
            c.appendChild(new Paragraph(doc));
            c.getFirstParagraph().appendChild(new Run(doc, "this is comment"));

            commentMap.put(guid, c);

            CommentRangeStart start = new CommentRangeStart(doc, c.getId());
            r.getParentNode().insertBefore(start, r);
        }
        if (r.getText().contains("data-comment-end"))
        {
            if (commentMap.containsKey(guid))
            {
                // Create comment range end for the comment opened earlier.
                Comment c = commentMap.get(guid);
                CommentRangeEnd end = new CommentRangeEnd(doc, c.getId());
                // Resulting order: marker run, CommentRangeEnd, Comment.
                r.getParentNode().insertAfter(c, r);
                r.getParentNode().insertAfter(end, r);
            }
        }
        // Remove the run that holds the marker text.
        r.remove();
    }
}

doc.save("C:\\Temp\\out.docx");

out.docx (11.7 KB)
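
The code above appears to rely on each marker sitting in a run of its own after the `"$0"` replace, so that the marker run can simply be removed. As a rough, Aspose-independent sketch of what that isolation amounts to, the following splits a string so that every marker becomes a separate element. `MarkerSplitter` and `split` are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MarkerSplitter {
    // Same marker syntax as in the code above.
    static final Pattern MARKER = Pattern.compile(
            "\\{\\{data-comment-(start|end)=\"([^\"]+)\"\\}\\}");

    // Splits the text so that every marker is its own element, keeping the original order.
    static List<String> split(String text) {
        List<String> parts = new ArrayList<>();
        int last = 0;
        Matcher m = MARKER.matcher(text);
        while (m.find()) {
            if (m.start() > last) {
                parts.add(text.substring(last, m.start())); // text before the marker
            }
            parts.add(m.group());                           // the marker itself
            last = m.end();
        }
        if (last < text.length()) {
            parts.add(text.substring(last));                // text after the last marker
        }
        return parts;
    }

    public static void main(String[] args) {
        // Three parts: "2222", the end marker, and " tail".
        for (String part : split("2222{{data-comment-end=\"abc\"}} tail")) {
            System.out.println("[" + part + "]");
        }
    }
}
```

Once markers are isolated like this, a marker element can be dropped without touching the surrounding text, which mirrors what `r.remove()` does to the marker run.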