Compare of Images Issue between Aspose Versions

Using Aspose.Words for .NET, I’m having an issue where images are flagged as added/removed when I compare documents generated by different versions of Aspose.

I’m in the process of upgrading to 24.1 but have hit an issue that I’ve tracked down to a change between Aspose 22.9 and 22.10. If I run my program with my input on Aspose 22.9, I get the following output:
Image_22-9.DOCX (1.5 MB)

Then I upgrade to 22.10 (with no other changes) and run it again to get:
Image_22-10.DOCX (1.5 MB)

The files look the same, but running our compare code I get the following result:
Image_Compare.DOCX (1.5 MB)

From inspecting the XML content, I believe they are flagged as different because the older version produced <wp:inline distT="0" distB="0" distL="0" distR="0"> while the newer version produces <wp:inline>. I confirmed this by manually editing the 22.10 document and rerunning the compare: with the attributes added, the diff came back clean.
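For anyone who wants to check this without editing the file by hand, here is a minimal sketch that counts the <wp:inline> elements missing any of the four dist* attributes. It reads the package directly with System.IO.Compression and does not use Aspose at all. The class and method names are hypothetical, and it assumes the images live in word/document.xml; header and footer images sit in separate parts (for example word/header1.xml), which can be passed as the part name.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of Aspose.Words.
public static class InlineDistanceCheck
{
    // Namespace bound to the wp: prefix in WordprocessingML drawings.
    private static readonly XNamespace Wp =
        "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing";

    private static readonly string[] DistanceAttributes = { "distT", "distB", "distL", "distR" };

    // Counts <wp:inline> elements that lack at least one of the four dist* attributes.
    public static int CountInlinesMissingDistances(string partXml)
    {
        XDocument xml = XDocument.Parse(partXml);
        return xml.Descendants(Wp + "inline")
                  .Count(inline => DistanceAttributes.Any(name => inline.Attribute(name) == null));
    }

    // Reads one XML part from a DOCX package and runs the check on it.
    public static int CountInlinesMissingDistancesInFile(
        string docxPath, string partName = "word/document.xml")
    {
        using (ZipArchive archive = ZipFile.OpenRead(docxPath))
        {
            ZipArchiveEntry entry = archive.GetEntry(partName);
            if (entry == null)
                throw new InvalidDataException(partName + " not found in " + docxPath);

            using (StreamReader reader = new StreamReader(entry.Open()))
                return CountInlinesMissingDistances(reader.ReadToEnd());
        }
    }
}
```

Running CountInlinesMissingDistancesInFile on the 22.9 and 22.10 outputs should show whether the attributes are the only thing that changed on the inline elements.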

I also tried to track down where in our process the distT="0" distB="0" distL="0" distR="0" was getting added in 22.9. It seems to occur due to these lines of code:

HeaderFooter hf = builder.CurrentSection.HeadersFooters[extractType];
HtmlSaveOptions saveOptions = new HtmlSaveOptions();
// PrettyFormat preserves field code tags
saveOptions.PrettyFormat = true;
// ExportImagesAsBase64 makes copying of images work
saveOptions.ExportImagesAsBase64 = true;
saveOptions.SaveFormat = SaveFormat.Html;
string content = hf.ToString(saveOptions);

I’m looking for a way to either get the attributes back on the tag in the new version or to exclude that specific difference from the compare. Changing the old documents is not possible for our use case because they have already been generated. Any help would be greatly appreciated.

Below is the code we are using to do the compare:

Document initialDoc = new Document(InputFileName);
Document compareDoc = new Document(CompareFileName);
CompareOptions compareOptions = new CompareOptions
{
	IgnoreDmlUniqueId = true
};

WriteLogLine("Creating comparison document.");
initialDoc.Compare(compareDoc, "System", DateTime.Now, compareOptions);

Revision[] revs = initialDoc.Revisions.Cast<Revision>().ToArray<Revision>();
foreach (Revision rev in revs)
{
	RevisionType type = rev.RevisionType;
	switch (type)
	{
		case RevisionType.Insertion:
			rev.Author = "Adder";
			break;
		case RevisionType.Deletion:
			rev.Author = "Remover";
			break;
		case RevisionType.FormatChange:
			rev.Author = "Styler";
			break;
		case RevisionType.StyleDefinitionChange:
			rev.Accept();
			break;
		case RevisionType.Moving:
			rev.Author = "Relocator";
			break;
	}
}

if (string.IsNullOrEmpty(OutputFileName))
{
	if (initialDoc.Revisions.Count == 0)
	{
		OutputFileName = Path.Combine(Path.GetDirectoryName(InputFileName),
			Path.GetFileNameWithoutExtension(InputFileName)) + "_NoDiff." + outputExt;
	} else
	{
		OutputFileName = Path.Combine(Path.GetDirectoryName(InputFileName),
			Path.GetFileNameWithoutExtension(InputFileName)) + "_Compare." + outputExt;
	}
}

WriteLogLine("Saving comparison document to {0}.", OutputFileName);
initialDoc.Save(OutputFileName, outputFormat);

@bmentzer The Aspose.Words compare mechanism works as expected. If you compare the Image_22-9.DOCX and Image_22-10.DOCX documents using MS Word, the result is the same.

The distT="0" distB="0" distL="0" distR="0" attributes are not the only difference between shapes in the document. They also have different ids. So to get the clean comparison result, you should use the following code:

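// Requires: using System; using Aspose.Words; using Aspose.Words.Drawing;
Document v1 = new Document(@"C:\Temp\Image_22-9.DOCX");
Document v2 = new Document(@"C:\Temp\Image_22-10.DOCX");

// Explicitly set the wrap distances on every shape in the newer document
// so they match the distT/distB/distL/distR values stored in the older one.
foreach (Shape s in v2.GetChildNodes(NodeType.Shape, true))
{
    s.DistanceBottom = 0;
    s.DistanceTop = 0;
    s.DistanceLeft = 0;
    s.DistanceRight = 0;
}

// Ignore the differing DrawingML unique ids during comparison.
Aspose.Words.Comparing.CompareOptions opt = new Aspose.Words.Comparing.CompareOptions();
opt.IgnoreDmlUniqueId = true;

v1.Compare(v2, "test", DateTime.Now, opt);
v1.Save(@"C:\Temp\out.docx");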

@alexey.noskov Thank you for the quick response.

The difference is caused by upgrading my Aspose version so it seems like there is something off about either the new way or the old way. I agree that compare is working correctly, but I believe the generation of the documents has a flaw and that flaw is only clear when using compare.

Regarding the workaround:
I can’t always be sure the images don’t have valid settings for DistanceBottom, DistanceTop, etc., so I can’t simply force them to 0. In the provided loop, the Shape already reports the value as 0 and I’m just setting it to 0 again. The code below prints two 0s to the console.

foreach (Aspose.Words.Drawing.Shape s in compareDoc.GetChildNodes(NodeType.Shape, true))
{
    Console.WriteLine(s.DistanceBottom);
    s.DistanceBottom = 0;
    Console.WriteLine(s.DistanceBottom);
}

Is there any way to detect that s.DistanceBottom is unset versus actually set to zero? Otherwise, I think I can work around the issue by checking whether it’s zero before setting it to zero.

Additionally, is there a way to find out which version of Aspose a document was generated with? That way I would know when to apply this workaround to each of the two documents and when not to.

Best,
Bridgette

@bmentzer I am afraid there is no way to detect whether DistanceBottom, DistanceTop, DistanceLeft or DistanceRight is explicitly set or returns a default/inherited value.
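Since a set zero and an unset zero cannot be told apart, the check-before-setting idea from the question can be sketched as below. It reassigns 0 only to distances that already read as 0 and leaves any non-zero value untouched. The class and method names are hypothetical, and the sketch assumes (based on the workaround earlier in this thread) that reassigning 0 is what makes the value explicit in the saved document.

```csharp
using Aspose.Words;
using Aspose.Words.Drawing;

// Hypothetical helper illustrating the guarded workaround.
public static class ShapeDistanceNormalizer
{
    // Rewrites only the wrap distances that already read as 0,
    // so legitimate non-zero distances are preserved.
    public static void WriteExplicitZeroDistances(Document doc)
    {
        foreach (Shape shape in doc.GetChildNodes(NodeType.Shape, true))
        {
            if (shape.DistanceTop == 0) shape.DistanceTop = 0;
            if (shape.DistanceBottom == 0) shape.DistanceBottom = 0;
            if (shape.DistanceLeft == 0) shape.DistanceLeft = 0;
            if (shape.DistanceRight == 0) shape.DistanceRight = 0;
        }
    }
}
```

Because the values it writes equal the values the shapes already report, calling this on both documents before Compare should be harmless for documents that already carry the attributes.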

I am not sure what caused the difference in internal document representation between the versions of Aspose.Words, but we continuously work on improving our product, and most likely the change was caused by some fix made in the code. Unfortunately, it is impossible to keep an exact internal document representation between versions. But we are sure the output produced by Aspose.Words conforms to the document format specification.
By the way, documents generated by different versions of MS Word might also differ in their internal representation while being visually the same.

There is no way to detect the version of Aspose.Words used to generate the document. However, if you unzip the DOCX document and inspect document.xml, you can find a comment like this:

<!-- Generated by Aspose.Words for .NET 22.10.0 -->
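Reading that comment can be automated. The following is a minimal sketch that opens the DOCX as a zip archive and extracts the version number from the comment. The class and method names are hypothetical, the comment format is assumed to match the example above, and the comment may be absent in some documents, in which case the method returns null.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Aspose.Words.
public static class GeneratorComment
{
    // Matches the comment format shown above and captures the version number.
    private static readonly Regex Pattern = new Regex(
        @"<!--\s*Generated by Aspose\.Words for \.NET (\d+(?:\.\d+)*)\s*-->");

    // Returns the version string from the comment, or null if no comment is found.
    public static string ExtractVersion(string documentXml)
    {
        Match match = Pattern.Match(documentXml);
        return match.Success ? match.Groups[1].Value : null;
    }

    // Reads word/document.xml from the DOCX package and extracts the version.
    public static string ExtractVersionFromFile(string docxPath)
    {
        using (ZipArchive archive = ZipFile.OpenRead(docxPath))
        {
            ZipArchiveEntry entry = archive.GetEntry("word/document.xml");
            if (entry == null)
                return null;

            using (StreamReader reader = new StreamReader(entry.Open()))
                return ExtractVersion(reader.ReadToEnd());
        }
    }
}
```

The returned string can be passed to System.Version.Parse to decide whether a document predates 22.10 and therefore already carries the explicit distance attributes.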