Bookmark comparison: Normal_0 style change causing an issue

Hi,

Question 1:
I want to compare two XML documents and find out which bookmarks were modified.
When using the compare method, I am able to get the revisions, but from a revision I am not able to get the corresponding bookmark. Is there any way to get the respective bookmark from a revision?

Question 2:
Currently, in order to get the edited bookmarks, I followed the approach below:
1. Loaded both the source (Original.xml) and destination (Modified.xml) XML files as Aspose WordML documents.
2. Iterated through all the bookmarks in the source document, extracted the content of each bookmark, and generated a separate document for each bookmark. Did the same for the destination document. Then compared each source bookmark document with the corresponding destination bookmark document. If revisions are present for a bookmark document, I concluded that the bookmark was edited.

For extracting the content and generating the document, I used the methods below:
ArrayList extractContent(Node startNode, Node endNode, boolean isInclusive)
Document generateDocument(Document srcDoc, ArrayList nodes)

In the generateDocument method I used keep source formatting:
NodeImporter importer = new NodeImporter(srcDoc, dstDoc, ImportFormatMode.KEEP_SOURCE_FORMATTING);

The problem I am now facing is that when comparing the source bookmark document with the destination bookmark document, I get a revision saying that the style Normal_0 changed, but I didn't modify anything related to styles.
Normal_0 is created while generating the bookmark document and appending the source bookmark content.
Please let me know how to overcome this issue.

This is the code I used to achieve this:

public static Map<String, Integer> getAsposeCompareResult(
        String sourceFileContent, String destFileContent, Map<String, Integer> asposeCompareResult) {
    try {
        // Both inputs are WordprocessingML files.
        LoadOptions loadOption = new LoadOptions();
        loadOption.setLoadFormat(LoadFormat.WORD_ML);
        Document srcDoc = new Document("C:\\Users\\muthu\\Desktop\\Aspose\\Original.xml", loadOption);
        Document destDoc = new Document("C:\\Users\\muthu\\Desktop\\Aspose\\Modified.xml", loadOption);
        String guid = null;
        BookmarkCollection srcBookMarks = srcDoc.getRange().getBookmarks();
        for (int i = 0; i < srcBookMarks.getCount(); i++) {
            Bookmark srcBookMark = srcBookMarks.get(i);
            String srcBookMarkName = srcBookMark.getName();
            System.out.println(srcBookMarkName);
            // Extract the source bookmark content into its own document.
            ArrayList srcExtractedNodesInclusive = extractContent(
                    srcBookMark.getBookmarkStart(),
                    srcBookMark.getBookmarkEnd(), true);
            Document srcChoreoDoc = generateDocument(srcDoc,
                    srcExtractedNodesInclusive);
            // Look up the bookmark with the same name in the destination document.
            Bookmark destBookmark = destDoc.getRange().getBookmarks()
                    .get(srcBookMarkName);
            if (destBookmark != null) {
                ArrayList destExtractedNodesInclusive = extractContent(
                        destBookmark.getBookmarkStart(),
                        destBookmark.getBookmarkEnd(), true);
                Document desChoreoDoc = generateDocument(destDoc,
                        destExtractedNodesInclusive);
                // Compare the two per-bookmark documents; revisions end up in srcChoreoDoc.
                srcChoreoDoc.compare(desChoreoDoc, "ChoreCompare",
                        new Date());
                if (srcChoreoDoc.getRevisions().getCount() == 0) {
                    System.out.println(destBookmark.getName()
                            + " are equal");
                } else {
                    System.out.println(destBookmark.getName()
                            + " has change");
                    System.out.println(srcChoreoDoc.getRevisions()
                            .getCount());
                    for (Revision rev : srcChoreoDoc.getRevisions()) {
                        System.out.println(rev.getRevisionType());
                        if (rev.getRevisionType() == RevisionType.DELETION) {
                            System.out.println("Term Deleted-->"
                                    + destBookmark.getName()
                                    + "Rev Node Type---->"
                                    + rev.getParentNode().getNodeType()
                                    + "---->"
                                    + rev.getRevisionType()
                                    + "------>"
                                    + rev.getParentNode().toString(
                                            SaveFormat.HTML));
                        }

                        if (rev.getRevisionType() == RevisionType.FORMAT_CHANGE) {
                            System.out.println("Term Format Change-->"
                                    + destBookmark.getName()
                                    + "Rev Node Type---->"
                                    + rev.getParentNode().getNodeType()
                                    + "---->"
                                    + rev.getRevisionType()
                                    + "------>"
                                    + rev.getParentNode().toString(
                                            SaveFormat.HTML));
                        }

                        // Style definition changes have a parent style, not a parent node.
                        if (rev.getRevisionType() == RevisionType.STYLE_DEFINITION_CHANGE) {
                            System.out.println("Term Style Change-->"
                                    + destBookmark.getName()
                                    + "Rev Node Type---->"
                                    + rev.getParentStyle().getName()
                                    + "---->" + rev.getRevisionType()
                                    + "------>"
                                    + rev.getParentStyle().getName());
                        }

                        // Fixed: a MOVING revision belongs to a node, so getParentNode()
                        // is used here; getParentStyle() only works for style definition changes.
                        if (rev.getRevisionType() == RevisionType.MOVING) {
                            System.out.println("Term Moving-->"
                                    + destBookmark.getName()
                                    + "Rev Node Type---->"
                                    + rev.getParentNode().getNodeType()
                                    + "---->" + rev.getRevisionType()
                                    + "------>"
                                    + rev.getParentNode().toString(
                                            SaveFormat.HTML));
                        }

                        if (rev.getRevisionType() == RevisionType.INSERTION) {
                            System.out.println("Term inserted-->"
                                    + destBookmark.getName()
                                    + "Rev Node Type---->"
                                    + rev.getParentNode().getNodeType()
                                    + "---->"
                                    + rev.getRevisionType()
                                    + "------>"
                                    + rev.getParentNode().toString(
                                            SaveFormat.HTML));
                        }
                    }
                }
            } else {
                // The bookmark does not exist in the destination document.
            }

            System.out.println("Revision Ended for Bookmark");
        }
    } catch (Exception e) {
        e.printStackTrace();
    }

    return asposeCompareResult;
}

Hi Muthu,


Thanks for your inquiry.

1) You can use the Revision.ParentNode property to get the immediate parent node (owner) of a revision. This property works for any revision type other than StyleDefinitionChange. Every Bookmark consists of two nodes, BookmarkStart and BookmarkEnd. You can determine whether the ParentNode is enclosed between these BookmarkStart and BookmarkEnd nodes. If it is, you can retrieve the Bookmark object using the BookmarkStart.Bookmark property.
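The "is the ParentNode enclosed between BookmarkStart and BookmarkEnd" check can be sketched as a forward walk in document order. The following is a minimal, library-independent sketch, not Aspose code: the class name EnclosedNodeCheck, the method isEnclosed, and the next stepping function are names made up for this illustration.

```java
import java.util.function.UnaryOperator;

public class EnclosedNodeCheck {

    /**
     * Walks forward from start in document order, using next to step,
     * and reports whether target is met no later than end.
     * Nodes are compared by identity, which is what a document tree needs.
     */
    static <T> boolean isEnclosed(T start, T end, T target, UnaryOperator<T> next) {
        for (T cur = start; cur != null; cur = next.apply(cur)) {
            if (cur == target) {
                return true;   // target lies between start and end (inclusive)
            }
            if (cur == end) {
                return false;  // reached the end marker without meeting target
            }
        }
        return false;          // ran off the document before reaching end
    }
}
```

With Aspose.Words for Java this could be called as isEnclosed(bookmark.getBookmarkStart(), bookmark.getBookmarkEnd(), rev.getParentNode(), n -> n.nextPreOrder(doc)), where doc is the document that holds the revisions. StyleDefinitionChange revisions should be skipped first, since they have no parent node. Note that a revision whose parent is a paragraph that begins before the BookmarkStart would not match with this simple walk.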

2) Please check my reply in your other thread:
Style Definition Change Issue when comparing two DOCX documents

The KeepSourceFormatting option ensures that the imported text looks in the destination document exactly as it did in the source document. If a matching style already exists in the destination document, the source style is copied and given a unique name by appending a suffix number to it, for example “Normal_0” or “Heading 1_5”.

The drawback of using KeepSourceFormatting is that if you perform several imports, you could end up with many styles in the destination document and that could make using consistent style formatting in Microsoft Word difficult for this document.

When using the UseDestinationStyles option, if a matching style already exists in the destination document, the style is not copied and the imported nodes are updated to reference the existing style.

The drawback of using UseDestinationStyles is that the imported text might look different in the destination document compared to the source document. For example, suppose the “Heading 1” style in the source document uses Arial 16pt font and the “Heading 1” style in the destination document uses Times New Roman 14pt font. When importing text of “Heading 1” style with no other direct formatting, it will appear as Times New Roman 14pt font in the destination document.

The KeepDifferentStyles option allows destination styles to be reused if the formatting they provide is identical to the styles in the source document. If the style in the destination document is different from the source, then the source style is imported.
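To make the three modes easier to compare, here is a toy model in plain Java that follows the description above. It is not Aspose code and does not call the library: the class ImportModeModel, the resolve method, the representation of a style as a name mapped to a formatting string, and the rule of picking the first free suffix number are all assumptions made for this illustration.

```java
import java.util.Map;

public class ImportModeModel {

    enum Mode { KEEP_SOURCE_FORMATTING, USE_DESTINATION_STYLES, KEEP_DIFFERENT_STYLES }

    /**
     * Returns the style name the imported text would reference in the destination.
     * dst maps style name -> formatting description and is updated in place
     * whenever a style has to be copied.
     */
    static String resolve(Mode mode, String name, String srcFormatting, Map<String, String> dst) {
        if (!dst.containsKey(name)) {
            dst.put(name, srcFormatting);      // no clash: copy the style under its own name
            return name;
        }
        boolean reuse = mode == Mode.USE_DESTINATION_STYLES
                || (mode == Mode.KEEP_DIFFERENT_STYLES && dst.get(name).equals(srcFormatting));
        if (reuse) {
            return name;                       // imported text points at the existing style
        }
        // Clash: copy the source style under a unique name (assumed suffix scheme).
        int n = 0;
        while (dst.containsKey(name + "_" + n)) {
            n++;
        }
        String unique = name + "_" + n;
        dst.put(unique, srcFormatting);
        return unique;
    }
}
```

In this model, importing a "Normal" style with KEEP_SOURCE_FORMATTING into a document that already has "Normal" produces "Normal_0" even when the formatting is identical, which matches the extra style you are seeing. The other two modes reuse the existing "Normal" in that case.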

If we can help you with anything else, please feel free to ask.

Best regards,

Thanks for your reply.

In order to find the edited bookmarks, I am comparing each bookmark from the source and destination documents.
Also, I found out that the style change issue occurs because the original source XML has font locale 1024, while the modified XML has a valid locale ID.

Hi Muthu,


Thanks for your inquiry. It is great you were able to find what you were looking for. Please let us know any time you have any further queries.

Best regards,

Hi,

One quick question:
I have an XML file (attached Sample_1.xml) that contains two bookmarks.
I am using the extract content method to fetch the content inside each bookmark, but the problem is that if a paragraph contains both the bookmark start and end, it does not extract the entire paragraph content; it extracts only up to the bookmark end in the paragraph.
I want to extract the entire paragraph content when I give the bookmark start and end.

Please advise how to proceed.

Thanks,
Muthulakshmi

Hi Muthulakshmi,


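Thanks for your inquiry. You can get the parent paragraph of the BookmarkStart/BookmarkEnd nodes using the Node.GetAncestor method (in Java, getAncestor(NodeType.PARAGRAPH)) and pass those paragraphs to your extract content method as the start and end nodes. Hope this helps.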

Best regards,