How can I compare the content of two PDF files using Aspose.Pdf?
I am afraid Aspose.Pdf for .NET does not currently support comparing the contents of two PDF documents. However, if both PDF documents are text-based, you can extract the text from each file and use a third-party component to compare the extracted text.
I have also logged this requirement in our issue tracking system as PDFNEWNET-31260 and asked our development team to investigate whether we can support this feature. Please be patient and allow us a little time.
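As a rough illustration of the extract-then-compare approach, the sketch below diffs two strings that are assumed to have already been extracted from the PDF files (the extraction step itself is not shown). It uses Python's standard difflib module as a stand-in for a third-party comparison component; the function name, file names, and sample text are hypothetical.

```python
import difflib


def diff_extracted_text(text_a, text_b, name_a="first.pdf", name_b="second.pdf"):
    """Return a unified diff (list of lines) of two already-extracted texts."""
    lines_a = text_a.splitlines()
    lines_b = text_b.splitlines()
    return list(
        difflib.unified_diff(
            lines_a, lines_b, fromfile=name_a, tofile=name_b, lineterm=""
        )
    )


# Hypothetical sample text standing in for the text extracted from each PDF.
text_a = "Invoice 1001\nTotal: 250.00\nThank you"
text_b = "Invoice 1001\nTotal: 275.00\nThank you"

for line in diff_extracted_text(text_a, text_b):
    print(line)
```

An empty result means the extracted text is the same line for line. It says nothing about images, fonts, or layout.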
Can we compare two PDF documents by converting them into bytes? I mean, can we compare them byte by byte?
Thanks for your acknowledgement.
I have further discussed this requirement with the development team and, as per our current understanding, I am pleased to share that we can support it. When two documents are compared, a resultant (XML) report will be generated. The comparison result will not simply state whether the two documents are equal, i.e. “YES” or “NO”; instead, the output report may contain information such as “page 1 – the level of textual correspondence 70%”. If page 1 of document 1 has X significant words and page 1 of document 2 has Y significant words, we can count how many significant words are equal (say, Z). The level of textual correspondence will then be Z / max(X, Y).
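The Z / max(X, Y) measure described above can be sketched as follows. This is only an illustration of the formula, not the planned implementation. What counts as a "significant" word is not defined in this thread, so the sketch assumes it means any lowercase alphanumeric token that is not in a small, hypothetical stop-word list, and it counts repeated words with multiplicity.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list; the real definition of
# "significant word" is not specified in this thread.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "on", "for"}


def significant_words(page_text):
    """Split page text into lowercase tokens and drop stop words."""
    words = re.findall(r"[a-z0-9]+", page_text.lower())
    return [w for w in words if w not in STOP_WORDS]


def textual_correspondence(page_a, page_b):
    """Return Z / max(X, Y) for the text of two pages."""
    words_a = Counter(significant_words(page_a))
    words_b = Counter(significant_words(page_b))
    x = sum(words_a.values())  # X: significant words on page A
    y = sum(words_b.values())  # Y: significant words on page B
    if max(x, y) == 0:
        return 1.0  # two pages without significant words are treated as equal
    z = sum((words_a & words_b).values())  # Z: words shared by both pages
    return z / max(x, y)


page_1_doc_1 = "The quick brown fox jumps over the lazy dog"
page_1_doc_2 = "The quick red fox jumps over the sleepy dog and cat"
print(textual_correspondence(page_1_doc_1, page_1_doc_2))  # 0.625
```

Here X = 7, Y = 8 and Z = 5, so the level of textual correspondence is 5 / 8 = 62.5%.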
Or do you need something different, such as checking whether page X of document 1 looks like page X of document 2 (i.e. the related page images are similar), or whether the text and forms on page X of document 1 contain the same information as the text and forms on the same page of document 2, or maybe something else? Please share more details.
Please note that we have some ideas for comparing non-textual content, but we would like to hear your thoughts on this requirement because, generally speaking, document comparison is a complex task. Note that even two PDF documents that appear identical can have a lot of differences when compared byte by byte.
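To illustrate why a byte-by-byte comparison answers only "identical or not", here is a minimal sketch using Python's standard filecmp module. The helper names are hypothetical, and the sample data are placeholder byte strings rather than valid PDF files; they differ only in a made-up creation date.

```python
import filecmp
import os
import tempfile


def files_byte_identical(path_a, path_b):
    """Return True only if both files contain exactly the same bytes."""
    return filecmp.cmp(path_a, path_b, shallow=False)


def write_temp(data):
    """Write bytes to a new temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".pdf")
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path


# Placeholder bytes, not real PDFs: same body, different creation date.
same_a = write_temp(b"%PDF-1.4 placeholder /CreationDate (D:20120101)")
same_b = write_temp(b"%PDF-1.4 placeholder /CreationDate (D:20120101)")
other = write_temp(b"%PDF-1.4 placeholder /CreationDate (D:20120102)")

print(files_byte_identical(same_a, same_b))  # True
print(files_byte_identical(same_a, other))   # False
```

A single differing byte, such as a timestamp in the metadata, is enough for the check to report a difference, and the result does not indicate what changed or whether the change is visible on the page.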
Can you supply an example of how to do this?
Is it possible to compare two PDF files and highlight the differences in a separate PDF file using Aspose.PDF for Java?