Extra paragraph inserted when comparing two documents

When comparing two Word documents, we sometimes get an extra paragraph incorrectly added in the compared result. Please find attached two examples.

Case 1 (C1)

  • Comparing c1_doc1.docx with c1_doc2.docx
  • C1_Compare_With_MSWord.docx has been generated with MS Word, showing a correct result
  • C1_Compare_With_Aspose._EXTRA_PARAGRAPH.xml has been generated with Aspose and contains an extra paragraph, located above the table.
  • The extra paragraph mark is clearly visible when paragraph marks are shown and all changes are accepted (or the view is set to No Markup)

Case 2 (C2)

  • Comparing c1_doc1.docx with c1_doc2.docx
  • The extra paragraph in C1_Compare_With_Aspose._EXTRA_PARAGRAPH.xml is located above the image

The attached sample Code.txt can be used to generate the comparison. The attached files were compared with Aspose.Words version 25.9.

ExtraP.zip (315.6 KB)

@andrelafond

Summary
When using Aspose.Words v25.9 to compare the two DOCX files provided in the zip, the resulting comparison document contains an extra empty paragraph that is not present in the Microsoft Word comparison result. In case 1 the paragraph appears above a table, and in case 2 it appears above an image.

Why it happens
The evidence only describes the symptom (an unexpected empty paragraph) and does not contain a definitive explanation from Aspose. At this time we cannot confirm the exact internal reason why the compare engine inserts that paragraph. It may be related to how the compare algorithm handles block‑level elements that follow a paragraph, but this is speculative.

What to do next

  1. Verify with a newer Aspose.Words version – Run the same comparison using the latest available release. If the extra paragraph disappears, the issue was likely fixed in a later build.
  2. Inspect the source documents – Unzip the DOCX files and open word/document.xml. Look for any empty <w:p> elements immediately before the <w:tbl> or <w:drawing> nodes. If such a paragraph exists, it could be carried over by the compare operation.
  3. Contact Aspose support – Provide the minimal reproducible sample (the two original DOCX files and the code you used) along with the version number. Ask whether the extra paragraph is a known limitation or a regression.
  4. Work‑around (post‑processing) – If the paragraph cannot be avoided, you can remove empty paragraphs programmatically after the comparison:
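     // Requires: import com.aspose.words.*;
     Document result = new Document("CompareResult.docx");
     // Snapshot the live collection so that removing nodes does not disturb the iteration.
     for (Node node : result.getChildNodes(NodeType.PARAGRAPH, true).toArray()) {
         Paragraph p = (Paragraph) node;
         // Only consider body-level paragraphs with no child nodes at all (no runs, no shapes).
         if (p.getFirstChild() != null || p.getParentNode().getNodeType() != NodeType.BODY) continue;
         Node next = p.getNextSibling();
         if (next == null) continue;
         boolean beforeTable = next.getNodeType() == NodeType.TABLE;
         boolean beforeImage = next.getNodeType() == NodeType.PARAGRAPH
                 && ((Paragraph) next).getChildNodes(NodeType.SHAPE, true).getCount() > 0;
         if (beforeTable || beforeImage) p.remove();
     }
     result.save("CompareResult_Cleaned.docx");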

This removes stray empty paragraphs that appear directly before a table or an image.
  5. Create a minimal test case – Build a tiny DOCX that contains only a paragraph followed by a table (or an image) and run the comparison. If the extra paragraph still appears, it confirms the behavior is intrinsic to the compare algorithm for that layout.
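
The inspection described in step 2 can also be scripted instead of unzipping the files by hand. The following is a minimal sketch that uses only the Java standard library, not an Aspose.Words API. The class name EmptyParagraphScan and its methods are hypothetical. It assumes the documents use the transitional WordprocessingML namespace and that images are stored as <w:drawing> elements, so legacy <w:pict> images are not detected.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Aspose.Words: scans WordprocessingML for
// run-less body-level paragraphs that sit directly before a table or a drawing.
class EmptyParagraphScan {
    // Assumption: transitional (non-Strict) OOXML namespace.
    static final String W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Counts body-level <w:p> elements that contain no <w:r> and whose next
    // sibling element is a <w:tbl>, or a <w:p> that contains a <w:drawing>.
    static int countSuspects(InputStream documentXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Element root = factory.newDocumentBuilder().parse(documentXml).getDocumentElement();
        NodeList bodies = root.getElementsByTagNameNS(W, "body");
        if (bodies.getLength() == 0) return 0;
        int count = 0;
        for (Node n = bodies.item(0).getFirstChild(); n != null; n = n.getNextSibling()) {
            if (!isW(n, "p") || has((Element) n, "r")) continue;
            Node next = nextElement(n);
            if (next == null) continue;
            if (isW(next, "tbl") || (isW(next, "p") && has((Element) next, "drawing"))) count++;
        }
        return count;
    }

    // Opens a DOCX as a ZIP archive and scans its word/document.xml part.
    static int countSuspectsInDocx(String path) throws Exception {
        try (ZipFile zip = new ZipFile(path)) {
            ZipEntry entry = zip.getEntry("word/document.xml");
            if (entry == null) throw new IOException("word/document.xml not found in " + path);
            try (InputStream in = zip.getInputStream(entry)) {
                return countSuspects(in);
            }
        }
    }

    // True if the node is an element named w:<name>.
    static boolean isW(Node n, String name) {
        return n.getNodeType() == Node.ELEMENT_NODE
                && W.equals(n.getNamespaceURI()) && name.equals(n.getLocalName());
    }

    // True if the element has at least one w:<name> descendant.
    static boolean has(Element e, String name) {
        return e.getElementsByTagNameNS(W, name).getLength() > 0;
    }

    // Returns the next sibling that is an element, skipping whitespace text nodes.
    static Node nextElement(Node n) {
        Node s = n.getNextSibling();
        while (s != null && s.getNodeType() != Node.ELEMENT_NODE) s = s.getNextSibling();
        return s;
    }

    public static void main(String[] args) throws Exception {
        // Prints one count per DOCX path given on the command line.
        for (String path : args) System.out.println(path + ": " + countSuspectsInDocx(path));
    }
}
```

Running it against both source documents and the saved compare result shows whether the empty paragraph was already present in one of the inputs or first appears in the output.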

Until a definitive cause is identified by Aspose, the steps above provide a practical way to verify and mitigate the issue.

Note: The description of the problem comes directly from the provided evidence. The suggested diagnostics and work‑arounds are based on common practices for handling unexpected nodes in Aspose.Words documents, but they have not been confirmed by the evidence itself.

Warning: I could not fully verify this explanation against the retrieved sources. Please double-check the behaviour in your environment and confirm the exact SDK/version.

@andrelafond
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): WORDSNET-28822

You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.