Show similarities

After comparing two PDFs with Aspose.Words, I want to mark the similarities rather than the differences. How do I do that in Python?

@sedaozdmiir Could you please attach your sample documents and provide the expected output?
Upon comparing documents, Aspose.Words, like MS Word, marks the differences with revisions. You can therefore accept the Delete revisions and reject the Insert revisions in the resulting document. The remaining content is the content that is the same in both source documents.
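
For illustration, the accept/reject idea described above could be sketched roughly as follows. This is only a minimal sketch and not an official sample: the helper names `strip_differences` and `save_common_content` and the file paths are hypothetical, and it assumes the revision collection can be copied with `list()` and that each revision exposes `revision_type`, `accept()` and `reject()`. The result contains only the shared content; the similarities are not highlighted in the original PDFs.

```python
import datetime


def strip_differences(revisions, deletion_type, insertion_type):
    """Accept deletions and reject insertions so only shared content remains.

    Hypothetical helper. `revisions` is any iterable of objects that expose
    `revision_type`, `accept()` and `reject()`.
    """
    accepted = rejected = 0
    # Work on a snapshot: accepting or rejecting a revision may remove it
    # from the live collection while we iterate.
    for rev in list(revisions):
        if rev.revision_type == deletion_type:
            rev.accept()   # drops text that exists only in the first document
            accepted += 1
        elif rev.revision_type == insertion_type:
            rev.reject()   # drops text that exists only in the second document
            rejected += 1
    return accepted, rejected


def save_common_content(path_a, path_b, out_path):
    """Hypothetical wrapper: compare two files and save what they share."""
    # Imported here so the helper above does not depend on the library.
    import aspose.words as aw

    doc_a = aw.Document(path_a)
    doc_b = aw.Document(path_b)
    doc_a.compare(doc_b, "test", datetime.date.today())
    strip_differences(doc_a.revisions,
                      aw.RevisionType.DELETION,
                      aw.RevisionType.INSERTION)
    doc_a.save(out_path)
```

A call such as `save_common_content("C:\\Temp\\1.pdf", "C:\\Temp\\2.pdf", "C:\\Temp\\common.pdf")` would then write the shared content to a new PDF. Other revision types, for example formatting changes, are left untouched by this sketch.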

I would like to mark the similar places, rather than the revisions that Aspose.Words crosses out in the resulting document after comparing the documents.

@sedaozdmiir Could you please elaborate on the format in which you would like to get the output? It would be great if you could provide your sample input and expected output documents. You can create the documents in MS Word, just for demonstration purposes. We will check your requirements and provide you with more information.

1.pdf (351.1 KB)
2.pdf (358.4 KB)
After comparing the two PDFs, I want the similar places to be marked in both PDFs. The result should be saved in PDF format.

@sedaozdmiir Thank you for the additional information. Unfortunately, there is no built-in way to achieve this. However, you can hide inserted and deleted text by setting the revision color to white:

import datetime

import aspose.words as aw

doc1 = aw.Document("C:\\Temp\\1.pdf")
doc2 = aw.Document("C:\\Temp\\2.pdf")

# Compare the documents; the differences are recorded as revisions in doc1.
doc1.compare(doc2, "test", datetime.date.today())

# Render deleted and inserted text in white,
# so the text is not visible in the output PDF.
doc1.layout_options.revision_options.deleted_text_color = aw.layout.RevisionColor.WHITE
doc1.layout_options.revision_options.inserted_text_color = aw.layout.RevisionColor.WHITE
# Also hide the revision bars.
doc1.layout_options.revision_options.show_revision_bars = False

doc1.save("C:\\Temp\\out.pdf")

In this case the output document will look like this: out.pdf (51.4 KB)

The revised text is still there, but it is the same color as the page.

Also, please note that Aspose.Words is designed to work with MS Word documents. MS Word documents are flow documents, and their structure is very similar to the Aspose.Words Document Object Model. PDF documents, on the other hand, are fixed-page format documents. While loading a PDF document, Aspose.Words converts the fixed-page document structure into the flow Document Object Model. Unfortunately, such a conversion does not guarantee 100% fidelity.
Regarding PDF document comparison using Aspose.Words: although two PDF documents might look the same visually, their structure might be different. This leads to different DOMs being built by Aspose.Words and, as a result, to differences in the document comparison. Aspose.Words comparison works similarly to the MS Word document compare feature, which compares the internal document representation rather than the visual document appearance.

Dear alexey.noskov,
Thank you so much for the help.
