I am trying to compare two documents (DOCX, PDF) and get the coordinates of the changes in the original documents. How can this be achieved?
- Can I get the coordinates of the changes from the PDF?
- Is DOCX to PDF conversion highly accurate?
Sure, I can help you with your questions regarding Aspose.PDF and Aspose.Words for Java.
Aspose.PDF for Java allows you to manipulate PDF documents and extract text along with its position (coordinates) from the document. However, directly extracting coordinates for detected changes requires additional logic.
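Here are the steps you can follow:
- Compare the two DOCX documents with Aspose.Words.
- Save the comparison result as a PDF.
- Extract the text along with its coordinates from that PDF with Aspose.PDF.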
Here is a basic example to guide you through the process:
import com.aspose.words.Document;
import com.aspose.words.SaveFormat;
import com.aspose.pdf.Rectangle;
import com.aspose.pdf.TextFragment;
import com.aspose.pdf.TextFragmentAbsorber;
import java.util.Date;

public class DocumentComparison {
    public static void main(String[] args) throws Exception {
        // Compare DOCX documents; compare() returns nothing and records the
        // differences as revisions inside docOriginal
        Document docOriginal = new Document("original.docx");
        Document docRevised = new Document("revised.docx");
        docOriginal.compare(docRevised, "John Doe", new Date());

        // Convert the result to PDF
        docOriginal.save("document_result.pdf", SaveFormat.PDF);

        // Extract text with coordinates from the resulting PDF.
        // The fully qualified name avoids a clash with com.aspose.words.Document.
        com.aspose.pdf.Document pdfDocument = new com.aspose.pdf.Document("document_result.pdf");
        TextFragmentAbsorber absorber = new TextFragmentAbsorber();
        pdfDocument.getPages().accept(absorber); // visits every page of the PDF
        for (TextFragment textFragment : (Iterable<TextFragment>) absorber.getTextFragments()) {
            String text = textFragment.getText();
            Rectangle rect = textFragment.getRectangle();
            double x = rect.getLLX(); // lower-left corner of the fragment
            double y = rect.getLLY();
            double width = rect.getWidth();
            double height = rect.getHeight();
            System.out.println("Text: " + text + ", X: " + x + ", Y: " + y + ", Width: " + width + ", Height: " + height);
        }
    }
}
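The "additional logic" mentioned earlier is not spelled out in this thread. One possible shape for it, shown here as a library-independent sketch rather than an Aspose feature, is to extract positioned words from each input file separately and diff the two word lists, so that every deletion keeps the coordinates it had in the original file. The `Word`, `Change`, and `PositionedDiff` names and the sample coordinates are hypothetical; in practice the word lists would be filled from whatever text extraction you use.

```java
import java.util.ArrayList;
import java.util.List;

/** Library-independent sketch: diff two word lists that carry bounding boxes. */
class PositionedDiff {

    /** A word plus its bounding box (hypothetical type; fill it from your text extraction). */
    static final class Word {
        final String text;
        final double x, y, width, height;
        Word(String text, double x, double y, double width, double height) {
            this.text = text; this.x = x; this.y = y; this.width = width; this.height = height;
        }
    }

    /** One change: "Deleted" carries the box from the original, "Inserted" the box from the revised file. */
    static final class Change {
        final String type;
        final Word word;
        Change(String type, Word word) { this.type = type; this.word = word; }
    }

    static List<Change> diff(List<Word> original, List<Word> revised) {
        int n = original.size(), m = revised.size();
        // lcs[i][j] = length of the longest common subsequence of original[i..] and revised[j..]
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                lcs[i][j] = original.get(i).text.equals(revised.get(j).text)
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }
        // Walk both lists; words outside the common subsequence are the changes.
        List<Change> changes = new ArrayList<>();
        int i = 0, j = 0;
        while (i < n && j < m) {
            if (original.get(i).text.equals(revised.get(j).text)) {
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                changes.add(new Change("Deleted", original.get(i++)));
            } else {
                changes.add(new Change("Inserted", revised.get(j++)));
            }
        }
        while (i < n) changes.add(new Change("Deleted", original.get(i++)));
        while (j < m) changes.add(new Change("Inserted", revised.get(j++)));
        return changes;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for extracted text fragments.
        List<Word> original = List.of(
                new Word("The", 72, 700, 20, 12),
                new Word("quick", 96, 700, 30, 12),
                new Word("fox", 130, 700, 18, 12));
        List<Word> revised = List.of(
                new Word("The", 72, 700, 20, 12),
                new Word("slow", 96, 700, 26, 12),
                new Word("fox", 126, 700, 18, 12));
        for (Change c : diff(original, revised)) {
            System.out.println(c.type + " \"" + c.word.text + "\" at x=" + c.word.x + ", y=" + c.word.y
                    + ", width=" + c.word.width + ", height=" + c.word.height);
        }
    }
}
```

The comparison here is by exact word text only, which keeps the sketch short. A real implementation would probably also need to handle words split across fragments, differing whitespace, and multi-page documents.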
Yes, Aspose.Words provides a highly accurate DOCX to PDF conversion, maintaining the formatting and layout of the original document.
For more details and comprehensive examples, please refer to the official documentation:
If you have any specific requirements or run into issues with the code snippets provided, please let me know, and I can offer more detailed assistance.
With Aspose.PDF, can we get the coordinates of the changes in the initial file rather than in the result file?
Regarding Aspose.PDF, we need to investigate this requirement in detail. Would you kindly share your sample file(s) along with the generated outputs? Please also share the sample code snippet for our reference. We will log an investigation ticket and share the ID with you.
The requirement is that I have two documents to compare, so there are two input files. After the comparison, I need to get the changes along with their coordinates. Something of this sort:
{
"x": 1.333299994468689,
"width": 1.333299994468689,
"y": 1.333299994468689,
"text": "i",
"type": "Deleted",
"height": 1.333299994468689
},
Does Aspose.PDF provide a way to do this comparison? And will it give me accurate coordinates of the changes in the original input files, so that I can accurately highlight the changes in my frontend application?
The file comparison feature in the Aspose.PDF API is still in its early stages, and we are actively working to enhance its functionality. To better understand and evaluate your requirements, we kindly request you to share the sample files and code snippet mentioned in our previous response. Once we receive this information, we will proceed with further investigation accordingly.
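On the frontend highlighting mentioned in the question: PDF coordinates are normally given in points (1/72 inch) with the origin at the bottom-left of the page, whereas browser canvases usually place the origin at the top-left. The sketch below shows the conversion such a highlight would need. It is a hypothetical helper, not part of any Aspose API, and it assumes an unrotated page whose page box starts at (0, 0); `pageHeightPts` and `scale` are values you would supply from your own viewer.

```java
/** Sketch: map a PDF rectangle (points, bottom-left origin) to top-left-origin pixels. */
class PdfToScreen {

    /**
     * Returns {left, top, width, height} in pixels.
     * Assumes an unrotated page whose page box starts at (0, 0).
     *
     * @param llx           lower-left x of the rectangle, in points
     * @param lly           lower-left y of the rectangle, in points
     * @param pageHeightPts page height in points (for example 792 for US Letter)
     * @param scale         pixels per point used by the viewer
     */
    static double[] toScreenRect(double llx, double lly, double width, double height,
                                 double pageHeightPts, double scale) {
        double left = llx * scale;
        // Flip the y axis: the top edge of the box is measured down from the top of the page.
        double top = (pageHeightPts - (lly + height)) * scale;
        return new double[] { left, top, width * scale, height * scale };
    }

    public static void main(String[] args) {
        // Made-up rectangle on a 792 pt tall page, rendered at 2 pixels per point.
        double[] r = toScreenRect(72, 700, 30, 12, 792, 2.0);
        System.out.println("left=" + r[0] + ", top=" + r[1] + ", width=" + r[2] + ", height=" + r[3]);
    }
}
```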