Aspose Word - Bookmarks missing for paragraphs from the document after creating instance of LayoutEnumerator for document comparison

Hi,
We are a licensed Aspose user and use Aspose.Words to compare Word documents with the Document.Compare method. We have noticed that when we compare two documents and then create instances of LayoutEnumerator and LayoutCollector, the document produced by Document.Compare (after accepting revisions) has no BookmarkStart and BookmarkEnd nodes for many paragraphs. This breaks our functionality, because we generate IDs for the paragraph blobs from the bookmark ID. We are currently using Aspose.Words 24.8.0.
I have verified that the bookmarks are present when I parse the paragraphs without any document comparison; the problem only appears when we compare documents. I have attached the Word files for your reference. After the comparison, the bookmarks for the paragraphs on the first pages are missing. Here is the code to reproduce the issue:

var v1 = Path.Join("TestFiles/v1.docx");
var document1 = new Document(v1);
var comments = document1.GetChildNodes(NodeType.Comment, true);
comments.Clear();
document1.Revisions.AcceptAll();
document1.RemoveMacros();


var v2 = Path.Join("TestFiles/v2.docx");
var document2 = new Document(v2);
comments = document2.GetChildNodes(NodeType.Comment, true);
comments.Clear();
document2.Revisions.AcceptAll();
document2.RemoveMacros();

var doc = document1;

var author = Guid.NewGuid().ToString();
doc.Compare(document2, author, DateTime.Now);

var revisions = doc.Revisions;
revisions
    .Where(rev => rev.RevisionType == RevisionType.FormatChange || rev.RevisionType == RevisionType.StyleDefinitionChange)
    .ToList()
    .ForEach(rev => rev.Accept());
var _enumerator = new LayoutEnumerator(doc);
var _collector = new LayoutCollector(doc);
var parentPara = (Paragraph)revisions[12].ParentNode;
var bookmarkStart = (BookmarkStart)parentPara.GetChildNodes(NodeType.Any, false)
    .FirstOrDefault(child =>
        child is BookmarkStart bookmark && bookmark.Name.StartsWith("Para"));

var bookmarkName = bookmarkStart?.Name;

If you check, bookmarkStart is null. This is a critical issue for our application, and we would really appreciate your help at your earliest convenience. Thanks.

v1.docx (70.4 KB)

v2.docx (366.7 KB)

@omer.asalm There are duplicate bookmark names in the resulting document. When you create LayoutEnumerator and LayoutCollector, Aspose.Words builds the document layout. Before the layout is built, the document is validated. Since duplicate bookmark names are not allowed in MS Word documents, the duplicates are removed during validation.

Hi @alexey.noskov, thanks for your reply. Did you mean that you found the same bookmark on different paragraphs, or duplicate bookmarks within a single paragraph? When I debug, I don't see multiple bookmarks within a paragraph, for example at revision index 12.
In any case, could you suggest what can be done here? Can we fix the bookmarks somehow? Thanks.

@omer.asalm If you run the following simple code, you will see that the two documents have bookmarks with the same names:

Document v1 = new Document(@"C:\Temp\v1.docx");
Document v2 = new Document(@"C:\Temp\v2.docx");

List<string> sameBookmarkNames = v1.Range.Bookmarks.Where(b1 => v2.Range.Bookmarks.Any(b2 => b2.Name == b1.Name))
    .Select(b => b.Name).ToList();

foreach (string bkName in sameBookmarkNames)
    Console.WriteLine(bkName);

After the comparison, the contents of the two documents are combined into one, so bookmarks with the same name can end up in different paragraphs. MS Word does not allow bookmarks with the same name, so such bookmarks are removed and some paragraphs might lose their bookmarks.

@alexey.noskov I understand your point. Is there any way to prevent this? What should we do when the documents contain duplicate bookmark names? We don't want to lose the bookmarks for these paragraphs.

@omer.asalm You can try renaming bookmarks in the documents before comparing them to avoid duplicated bookmark names in the resulting document.
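
The renaming suggested above could be sketched roughly as follows. This is a minimal sketch, not a confirmed Aspose recommendation: BookmarkRenamer, MakeUnique, RenameCollidingBookmarks and the "_v2" suffix are hypothetical names introduced here for illustration. It relies only on Document.Range.Bookmarks and the settable Bookmark.Name property, and it assumes bookmark names should be compared case-insensitively. A suffix is used rather than a prefix so that checks such as Name.StartsWith("Para") keep working.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Aspose.Words;

// Hypothetical helper class; not part of Aspose.Words.
static class BookmarkRenamer
{
    // Returns 'name' if it is not yet in 'taken'; otherwise appends the suffix
    // (and a counter if needed) until the result is unused.
    // The returned name is recorded in 'taken'.
    public static string MakeUnique(string name, ISet<string> taken, string suffix)
    {
        string candidate = name;
        int n = 0;
        while (!taken.Add(candidate)) // Add returns false when the name already exists
        {
            n++;
            candidate = n == 1 ? name + suffix : name + suffix + "_" + n;
        }
        return candidate;
    }

    // Renames bookmarks in 'target' whose names already occur in 'reference'
    // (or earlier in 'target'), so the two documents share no bookmark names.
    public static void RenameCollidingBookmarks(Document reference, Document target, string suffix = "_v2")
    {
        // Assumption: names are matched case-insensitively.
        var taken = new HashSet<string>(
            reference.Range.Bookmarks.Select(b => b.Name),
            StringComparer.OrdinalIgnoreCase);

        // Snapshot the collection before changing any names.
        foreach (Bookmark bookmark in target.Range.Bookmarks.ToList())
        {
            string unique = MakeUnique(bookmark.Name, taken, suffix);
            if (unique != bookmark.Name)
                bookmark.Name = unique;
        }
    }
}
```

With the repro code from the first post, the call would go just before the comparison, for example BookmarkRenamer.RenameCollidingBookmarks(document1, document2); followed by doc.Compare(document2, author, DateTime.Now);. Any IDs derived from bookmark names in the second document will then carry the suffix, so downstream code that maps IDs back to paragraphs may need to strip it. Whether the renamed bookmarks affect the comparison result has not been verified here.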