When we compare two PDF files and the resulting PDF has redlines to indicate the changes, how can we programmatically identify which pages have the redlines?
To programmatically identify which pages in a PDF document contain redlines after comparing two PDF files, you can utilize the Aspose.PDF library’s features for handling annotations and revisions. Here’s a step-by-step approach:
Cause: You need to identify pages with redlines in a PDF document resulting from a comparison of two PDF files.
Solution: Use the Aspose.PDF library to load the PDF, iterate through its pages, and check for specific annotations that indicate redlines.
Next Steps:
- Load the PDF Document: Use the Document class to load the resulting PDF that contains the redlines.
- Iterate Through Pages: Loop through each page of the document to check for annotations or revisions.
- Check for Redlines: For each page, check whether there are any annotations of type TextAnnotation or MarkupAnnotation that indicate changes. You can also look for specific properties that denote redlines.
- Store Page Numbers: Keep track of the page numbers where redlines are found.
Here’s a sample code snippet to illustrate this:
using Aspose.Pdf;
using Aspose.Pdf.Annotations;
using System.Collections.Generic;

public List<int> GetPagesWithRedlines(string pdfFilePath)
{
    List<int> pagesWithRedlines = new List<int>();
    Document pdfDocument = new Document(pdfFilePath);
    // Page indexes in Aspose.PDF start at 1.
    for (int i = 1; i <= pdfDocument.Pages.Count; i++)
    {
        Page page = pdfDocument.Pages[i];
        foreach (Annotation annotation in page.Annotations)
        {
            if (annotation is TextAnnotation || annotation is MarkupAnnotation)
            {
                // Assuming redlines are represented by specific annotation types
                pagesWithRedlines.Add(i);
                break; // No need to check further annotations on this page
            }
        }
    }
    return pagesWithRedlines;
}
This code will return a list of page numbers that contain redlines. Make sure to adjust the conditions based on how redlines are represented in your specific PDF documents.
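Because the right condition depends on what the comparison tool actually writes into the file, it can help to first survey which annotation types appear on each page. The following is a minimal sketch, not part of the original answer: the class and method names (AnnotationSurvey, CountAnnotationTypes) are hypothetical, and it assumes the redlines are stored as annotations at all. If the redlines are drawn directly into the page content, they will not show up in this survey.

```csharp
using System.Collections.Generic;
using Aspose.Pdf;
using Aspose.Pdf.Annotations;

// Hypothetical helper, for illustration only.
public static class AnnotationSurvey
{
    // Returns, for each page that has annotations, how many annotations
    // of each AnnotationType were found on that page.
    public static Dictionary<int, Dictionary<AnnotationType, int>> CountAnnotationTypes(Document pdf)
    {
        var result = new Dictionary<int, Dictionary<AnnotationType, int>>();

        // Page indexes in Aspose.PDF start at 1.
        for (int i = 1; i <= pdf.Pages.Count; i++)
        {
            var counts = new Dictionary<AnnotationType, int>();
            foreach (Annotation annotation in pdf.Pages[i].Annotations)
            {
                // TryGetValue leaves n at 0 when the type has not been seen yet.
                counts.TryGetValue(annotation.AnnotationType, out int n);
                counts[annotation.AnnotationType] = n + 1;
            }

            if (counts.Count > 0)
            {
                result[i] = counts;
            }
        }

        return result;
    }
}
```

Running this on a page that is known to contain redlines shows which annotation types, if any, are worth testing for in the condition of GetPagesWithRedlines.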
If you have any further questions or need additional assistance, feel free to ask!
How would I do this for DOCX?
Aspose.Words is the API that enables you to process .docx files. We are moving the inquiry to the respective forum category, where you will be assisted accordingly.
@randomuser123 Aspose.Words comparison works the same way as MS Word’s comparison, i.e. detected differences are shown in the output document as revisions. Please see our documentation for more information:
https://docs.aspose.com/words/net/compare-documents/
After comparing the documents, you can read the revisions to get a comparison summary. See the Document.Revisions property.
You can use LayoutCollector to get the page number where a particular node is located. For example, see the following code:
using System;
using Aspose.Words;
using Aspose.Words.Layout;

Document doc1 = new Document(@"C:\Temp\file1.docx");
Document doc2 = new Document(@"C:\Temp\file2.docx");
// After Compare, doc1 contains the differences as revisions.
doc1.Compare(doc2, "test", DateTime.Now);
// Create LayoutCollector and get page numbers where the revisions are located.
LayoutCollector collector = new LayoutCollector(doc1);
foreach (Revision r in doc1.Revisions)
{
    int page = collector.GetStartPageIndex(r.ParentNode);
    Console.WriteLine($"Page: {page}; Type: {r.RevisionType}; Author: {r.Author}; Text: '{r.ParentNode.ToString(SaveFormat.Text).Trim()}'");
}
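If only the list of affected pages is needed, the loop above can be reduced to collecting distinct page numbers. This is a minimal sketch rather than an official Aspose sample: the class and method names (RevisionPages, GetPagesWithRevisions) are hypothetical. It skips StyleDefinitionChange revisions on the assumption, based on my reading of the Aspose.Words documentation, that ParentNode is not available for that revision type, and it ignores a page index of 0, which GetStartPageIndex returns when a node cannot be mapped to a page.

```csharp
using System.Collections.Generic;
using Aspose.Words;
using Aspose.Words.Layout;

// Hypothetical helper, for illustration only.
public static class RevisionPages
{
    // Returns the sorted, distinct page numbers on which revisions start.
    public static SortedSet<int> GetPagesWithRevisions(Document doc)
    {
        var pages = new SortedSet<int>();
        var collector = new LayoutCollector(doc);

        foreach (Revision revision in doc.Revisions)
        {
            // Assumption: style definition changes have no parent node
            // in the document body, so they are skipped here.
            if (revision.RevisionType == RevisionType.StyleDefinitionChange)
            {
                continue;
            }

            int page = collector.GetStartPageIndex(revision.ParentNode);

            // 0 means the node could not be mapped to a page.
            if (page > 0)
            {
                pages.Add(page);
            }
        }

        return pages;
    }
}
```

Called on doc1 after doc1.Compare(...), this returns each affected page once. A revision whose node spans a page break is reported only on its starting page; collector.GetEndPageIndex could be used in addition if the end page matters.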
Thank you, Alexey, that is working!
I am trying to do a DOCX comparison:
var stream = new MemoryStream();
Document oldDocument = new Document(oldFile);
Document newDocument = new Document(newFile);
Aspose.Words.Comparing.CompareOptions compareOptions = new CompareOptions
{
    Target = ComparisonTargetType.Current
};
newDocument.AcceptAllRevisions();
oldDocument.AcceptAllRevisions();
oldDocument.TrackRevisions = false;
newDocument.TrackRevisions = false;
oldDocument.Compare(newDocument, author, DateTime.UtcNow, compareOptions);
oldDocument.Save(stream, SaveFormat.Docx);
stream.Position = 0;
return stream;
But I'm getting an error saying ‘NC sync failed’.
I am using Aspose.Words 23.3.0, and comparing the same files in MS Word works as expected.
Can you please explain what this error means?
@randomuser123 Could you please attach the problematic input documents here for testing? We will check the issue on our side and provide you more information.
I tried using this method to identify the redlines in a PDF by looking for text or markup annotations, but the annotations are of type link. Can we assume that all redlines are of type link?
@randomuser123 Aspose.Words mimics MS Word behavior when exporting revisions to PDF. The goal is to provide the visual representation of revisions the same way as they are shown in MS Word.