Compare DOCX with Images & Save to PDF C# .NET | Unique Picture IDs in Word Document Cause Deletion & Insertion Revisions

Hi,

We noticed an issue: when we compare documents that were previously saved with Aspose, unchanged images are shown as changed, and they are duplicated when the comparison is saved to PDF.

In our processing, we generally take the original documents, change headers / footers, update fields and save them again. A customer noticed that those ‘updated’ documents show duplicated images when they are compared.
I did some analysis, and it looks like saving the documents once with Aspose is already enough to cause the duplication. Please see the attached example with the following test code:

            var lic = new License();
            lic.SetLicense(@"S:\Aspose.Total.lic");

            // ========================================================
            // Comparing the original files works fine
            // ========================================================
            var docold = new Document(@"S:\tmp\comp\ver0.docx"); // original revision 0
            var docnew = new Document(@"S:\tmp\comp\ver1.docx"); // original revision 1

            docold.Compare(docnew, "test", DateTime.Now); // compare them
            docold.Save(@"S:\tmp\comp\compared_directly.docx"); // save the comparison -> looks fine
            docold.Save(@"S:\tmp\comp\compared_directly.pdf"); // pdf version -> looks fine

            // ========================================================
            // Comparing files saved with Aspose duplicates images
            // ========================================================
            docold = new Document(@"S:\tmp\comp\ver0.docx"); // original revision 0
            docnew = new Document(@"S:\tmp\comp\ver1.docx"); // original revision 1

            // change header / footers, update fields, etc.

            docold.Save(@"S:\tmp\comp\old.docx"); // save updated old revision
            docnew.Save(@"S:\tmp\comp\new.docx"); // save updated new revision

            // compare updated versions
            docold = new Document(@"S:\tmp\comp\old.docx"); // load updated old revision
            docnew = new Document(@"S:\tmp\comp\new.docx"); // load updated new revision

            docold.Compare(docnew, "test", DateTime.Now); // compare them
            docold.Save(@"S:\tmp\comp\compared_after_save.docx"); // save the comparison -> images are displayed in changes
            docold.Save(@"S:\tmp\comp\compared_after_save.pdf"); // pdf version -> images are duplicated

example.zip (279.5 KB)
Files in example:
ver0.docx -> original document old revision
ver1.docx -> original document new revision
compared_directly.docx -> ver0.docx and ver1.docx compared with Aspose -> ok
old.docx -> ver0.docx saved again with Aspose
new.docx -> ver1.docx saved again with Aspose
compared_after_save.docx -> old.docx and new.docx compared with Aspose -> contains changes in images but images have not been changed
compared_after_save.pdf -> compared_after_save.docx saved as pdf with Aspose
compared_after_save_made_with_word2016.pdf -> compared_after_save.docx saved as pdf with Word 2016

In the saved DOCX file the images appear in the changes, and in the PDF version they are simply duplicated.
In my opinion, these are two bugs combined:

  1. Unchanged images appear in the changes after the document has been saved with Aspose.
  2. The PDF version of the comparison displays the changed image in the content and not in the ‘sidebar’ with the changes. If you open the incorrect DOCX comparison with Word 2016 and save the PDF manually, the images are shown in the sidebar; see the attached files.

Kind regards,
Daniel

@Serraniel,

We have logged this problem in our issue tracking system with ID WORDSNET-20997. We will look further into the details of this problem and keep you updated on the status of the linked issue. We apologize for the inconvenience.

In the meantime, please try the ShowInBalloons property of the RevisionOptions class. Sample C# code is as follows:

...
...
docold.Compare(docnew, "test", DateTime.Now); // compare them
docold.Save(@"C:\Temp\example\compared_after_save.docx"); // save the comparison -> images are displayed in changes
docold.LayoutOptions.RevisionOptions.ShowInBalloons = ShowInBalloons.FormatAndDelete; // render deletions and format changes in balloons; should fix the second issue
docold.Save(@"C:\Temp\example\compared_after_save.pdf"); // pdf version -> deleted images should now appear in balloons instead of being duplicated in the content
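
For the first issue, a small diagnostic along the following lines might help confirm that the unexpected revisions really sit on the images. This is only a sketch: it assumes that revisions on pictures are reported with a Shape as their parent node, which has not been verified against the attached files, and the names RevisionInspector and CountShapeRevisions are hypothetical helpers, not part of Aspose.Words.

```csharp
using System;
using Aspose.Words;

static class RevisionInspector
{
    // Hypothetical helper: prints and counts the revisions whose parent node is a Shape.
    public static int CountShapeRevisions(Document doc)
    {
        int count = 0;
        foreach (Revision revision in doc.Revisions)
        {
            // Style definition changes have a parent style rather than a parent node, so skip them.
            if (revision.RevisionType == RevisionType.StyleDefinitionChange)
                continue;

            // Assumption: a picture that was inserted or deleted shows up as a Shape node here.
            if (revision.ParentNode.NodeType == NodeType.Shape)
            {
                Console.WriteLine($"{revision.RevisionType} revision by {revision.Author} on a shape");
                count++;
            }
        }
        return count;
    }
}
```

Calling RevisionInspector.CountShapeRevisions(docold) right after docold.Compare(...) in both halves of the test code would then show whether the re-saved files produce shape revisions that the original files do not.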

Thanks for your reply and for looking into the first issue. I wasn't aware that there is an option for this PDF layout type; we will try it out.

@Serraniel,

Sure. We will also inform you here as soon as the linked issue (WORDSNET-20997) is resolved or any further updates become available.

The issues you found earlier have been fixed in the Aspose.Words for .NET 20.12 update and the Aspose.Words for Java 20.12 update.