Aspose word document revisions inconsistent behaviour

Hi,

I’m comparing two Word documents and would like to keep the differences between them as revisions for the end user, but I’m seeing very inconsistent behavior. The code basically looks like this:

@SuppressWarnings("unchecked")
public static void main(String[] args) throws Exception {
    Document doc1 = new Document("C:\\AsposeTest\\" + "docA.docx");
    Document doc2 = new Document("C:\\AsposeTest\\" + "docB.docx");

    RevisionOptions revisionOptions = getRevisionOptions(doc1);
    doc1.compare(doc2, "Me", DateTime.now().toDate());

    //FILE 1
    String fileLocationName = "C:\\AsposeTest\\" + "1. A_B_Compared.docx";

    doc1.save(fileLocationName, SaveFormat.DOCX);

    File file = new File(fileLocationName);
    Document recoveredDoc = new Document(new ByteArrayInputStream(FileUtils.readFileToByteArray(file)));
    //FILE 2
    recoveredDoc.save("C:\\AsposeTest\\" + "2. A_B_Compared Recovered.docx", SaveFormat.DOCX);
}

private static RevisionOptions getRevisionOptions(Document document) {
    RevisionOptions revisionOptions = document.getLayoutOptions().getRevisionOptions();
    revisionOptions.setShowOriginalRevision(false);
    revisionOptions.setShowRevisionMarks(true);
    revisionOptions.setRevisionBarsWidth(5);
    revisionOptions.setInsertedTextEffect(RevisionTextEffect.NONE);
    return revisionOptions;
}

FILE 1 shows the revisions with the differences correctly only in one very specific case:

  • On an unactivated version of Office 365, and only if I have another Word file open (even a blank document), it shows them the way I want:

On an activated version of Office 2016, it is shown as in the image below: most of the data is lost, and what is left is compared against its own data.

The document itself has the data in its structured document tags; here’s how one looks:

For FILE 2, I’m simulating saving to a java.io.File, since that’s how my DB works: it uses a File to save the document into a blob.

In this case the data always looks like the second case above, with the differences lost and the remaining ones compared against themselves. Its structured document tag also looks different.

Is there a better way to achieve the behavior I want? Am I doing something wrong or silly?

Thank you.

@lconde57 Could you please attach your input and output documents here for testing? We will check the issue and provide you more information.
Also, in your code you set document.getLayoutOptions().getRevisionOptions() but save the document in DOCX format. These options take effect only when the document is rendered to fixed-page formats, such as PDF or XPS.

Thanks for the help.

I found a way to achieve what I wanted by removing the XML mapping, but I would rather not do that, since I’ll need it for placeholder management via a Word add-in.

These are the files: AsposeTest.7z (57.3 KB)

This is the code:

import org.joda.time.DateTime;

import com.aspose.words.Document;
import com.aspose.words.Node;
import com.aspose.words.NodeType;
import com.aspose.words.Revision;
import com.aspose.words.RevisionCollection;
import com.aspose.words.SaveFormat;
import com.aspose.words.StructuredDocumentTag;

public class AsposeTest2 {

	@SuppressWarnings("unchecked")
	public static void main(String[] args) throws Exception {
		Document doc1 = new Document("C:\\AsposeTest\\" + "docA.docx");
		Document doc2 = new Document("C:\\AsposeTest\\" + "docB.docx");

		Document doc1Clone =  doc1.deepClone();
		doc1.compare(doc2, "Me", DateTime.now().toDate());

		//FILE 1 WITHOUT MAPPING REMOVED
		doc1.save("C:\\AsposeTest\\" + "FILE1 HAS MAPPING WRONG DIFFERENCES.docx", SaveFormat.DOCX);

		//FILE 2 WITH MAPPING REMOVED
		removeMapping(doc1); //WOULD PREFER NOT DO THIS
		doc1.save("C:\\AsposeTest\\" + "FILE2 NO MAPPING CORRECT DIFFERENCES.docx", SaveFormat.DOCX);

		//FILE 3 EXTRA DIFFERENCE
		removeSdt(doc1Clone);
		removeSdt(doc2);
		doc1Clone.compare(doc2, "Me", DateTime.now().toDate());

		RevisionCollection revisions = doc1Clone.getRevisions();
		for (int i = 0; i < revisions.getCount(); i++) {
			Revision r = revisions.get(i);
			Node node = r.getParentNode();
			String temp = node.getText();
			System.out.println(temp + "\r");
		}

		doc1Clone.save("C:\\AsposeTest\\" + "FILE3 EXTRA DIFFERENCE.docx", SaveFormat.DOCX);
	}

	@SuppressWarnings("unchecked")
	private static void removeMapping(Document doc) {
		for (StructuredDocumentTag sdt : (Iterable<StructuredDocumentTag>)doc.getChildNodes(NodeType.STRUCTURED_DOCUMENT_TAG, true)) {
			try {
				if (sdt.getXmlMapping().isMapped()) {
					sdt.getXmlMapping().delete();
				}
			} catch (Exception e) {
				// TODO Auto-generated catch block
				e.printStackTrace();
			}
		}
	}

	@SuppressWarnings("unchecked")
	private static void removeSdt(Document doc) {
		for (StructuredDocumentTag sdt : (Iterable<StructuredDocumentTag>)doc.getChildNodes(NodeType.STRUCTURED_DOCUMENT_TAG, true)) {
			try {
				sdt.removeSelfOnly(); // Removes just this SDT node itself, but keeps the content of it inside the document tree
			} catch (Exception e) {
				// TODO Auto-generated catch block
				e.printStackTrace();
			}
		}
	}
}

I added one extra issue: for FILE 3, why is the last line “70,000.20” highlighted as a difference even though there is no difference between the two documents?

Thank you.

@lconde57 Thank you for the additional information. As far as I can see, you are using the old 21.4 version of Aspose.Words for Java. I have checked your scenario using the latest 22.9 version and the result looks correct. The simple test code is the following:

Document doc1 = new Document("C:\\Temp\\docA.docx");
Document doc2 = new Document("C:\\Temp\\docB.docx");

doc1.compare(doc2, "Test", new Date());

String fileLocationName = "C:\\Temp\\out.docx";
doc1.save(fileLocationName);

Document recoveredDoc = new Document(new ByteArrayInputStream(Files.readAllBytes(Paths.get(fileLocationName))));
recoveredDoc.save("C:\\Temp\\out2.docx", SaveFormat.DOCX);

The output documents look the same on my side. Comparing the document.xml files from the output documents shows differences only in the SDT storeItemChecksum values.

Checking the same scenario with version 21.4 of Aspose.Words shows differences in document.xml between the output documents. So it looks like the old 21.4 version has an issue with your documents. Here are the output documents produced by the latest version: out2.docx (17.8 KB) out.docx (17.8 KB)
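
For anyone who wants to repeat the document.xml comparison on their own output files, here is a minimal sketch that uses only the JDK. It assumes the main document part is stored at word/document.xml inside the DOCX package, which is the usual location but not guaranteed for every producer. The class name DocxMainPartCompare and its methods are hypothetical helpers made up for this illustration; they are not part of Aspose.Words.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper (not an Aspose.Words API): compares the main
// document part of two DOCX packages byte for byte.
class DocxMainPartCompare {

    // Assumption: the main part lives at this path inside the ZIP package.
    static final String MAIN_PART = "word/document.xml";

    // Reads the raw bytes of word/document.xml from the given DOCX file.
    static byte[] readMainPart(Path docx) throws IOException {
        try (ZipFile zip = new ZipFile(docx.toFile())) {
            ZipEntry entry = zip.getEntry(MAIN_PART);
            if (entry == null) {
                throw new IOException(MAIN_PART + " not found in " + docx);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return in.readAllBytes(); // requires Java 9 or later
            }
        }
    }

    // Returns true when both packages contain byte-identical main parts.
    static boolean sameMainPart(Path a, Path b) throws IOException {
        return Arrays.equals(readMainPart(a), readMainPart(b));
    }

    public static void main(String[] args) throws IOException {
        Path first = Paths.get(args[0]);
        Path second = Paths.get(args[1]);
        System.out.println(sameMainPart(first, second)
                ? "document.xml is identical"
                : "document.xml differs");
    }
}
```

A byte-level check like this reports any difference at all, including ones that differ only in storeItemChecksum values, so to see what actually changed it is easier to write the bytes returned by readMainPart to two files and open them in a text diff tool.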

The extra difference appears because, after removing the SDTs, one of the documents has an empty paragraph after 70,000.20. You can save the intermediate document to see this.

Thanks for the answers, they were really helpful. One last thing: could you tell me why the word “Liability” is marked as changed in these files: Test2.7z (39.4 KB)
Code:

import org.joda.time.DateTime;

import com.aspose.words.Document;
import com.aspose.words.Node;
import com.aspose.words.NodeType;
import com.aspose.words.Revision;
import com.aspose.words.RevisionCollection;
import com.aspose.words.SaveFormat;
import com.aspose.words.StructuredDocumentTag;

public class AsposeTest3 {

	@SuppressWarnings("unchecked")
	public static void main(String[] args) throws Exception {
		Document doc1 = new Document("C:\\AsposeTest\\" + "docA.docx");
		Document doc2 = new Document("C:\\AsposeTest\\" + "docB.docx");

		removeMapping(doc1);
		removeMapping(doc2);
		removeSdt(doc1);
		removeSdt(doc2);
		doc1.compare(doc2, "Me", DateTime.now().toDate());
		doc1.save("C:\\AsposeTest\\" + "COMPARE RESULT.docx", SaveFormat.DOCX);

		RevisionCollection revisions = doc1.getRevisions();
		for (int i = 0; i < revisions.getCount(); i++) {
			Revision r = revisions.get(i);
			Node node = r.getParentNode();
			String temp = node.getText();
			System.out.println(temp);
		}
	}

	@SuppressWarnings("unchecked")
	private static void removeMapping(Document doc) {
		for (StructuredDocumentTag sdt : (Iterable<StructuredDocumentTag>)doc.getChildNodes(NodeType.STRUCTURED_DOCUMENT_TAG, true)) {
			try {
				if (sdt.getXmlMapping().isMapped()) {
					sdt.getXmlMapping().delete();
				}
			} catch (Exception e) {
				// TODO Auto-generated catch block
				e.printStackTrace();
			}
		}
	}

	@SuppressWarnings("unchecked")
	private static void removeSdt(Document doc) {
		for (StructuredDocumentTag sdt : (Iterable<StructuredDocumentTag>)doc.getChildNodes(NodeType.STRUCTURED_DOCUMENT_TAG, true)) {
			try {
				sdt.removeSelfOnly(); // Removes just this SDT node itself, but keeps the content of it inside the document tree
			} catch (Exception e) {
				// TODO Auto-generated catch block
				e.printStackTrace();
			}
		}
	}
}

If you check the console, it prints Liability even though it is not part of the changes:

CPA
xxx
Liability



Add a new sentence from the supplier

@lconde57 In this case the parent node of the revision is a Paragraph, and the node.getText() method returns the text of the whole paragraph: