PDF Incremental Updates

Hello,


The PDF specification defines an Incremental Updates feature.

PDFBox already has an API for this: PDDocument (PDFBox reactor 2.0.2 API). Does Aspose have a similar API?

What I would like to do is add an annotation to a PDF without changing the existing part of the document (similar to what Adobe does when you add a comment to a PDF). Thank you.

Best Regards,
Tuyen

Hi Tuyen,


Thanks for contacting support.

Yes, you can add an annotation to a PDF without changing the existing part of the document. Please see the following code snippet:

JAVA

Document document = new Document(dataDir + "pdf-sample.pdf");

// Text annotation on the first page
TextAnnotation text2 = new TextAnnotation(document.getPages().get_Item(1), new Rectangle(200, 500, 220, 550));
text2.setName("Text annot with manual popup");
text2.setTitle("This title not in popup");
text2.setContents("Description: Text annot with manual popup annotation");
text2.setIcon(TextIcon.Comment);
text2.setColor(Color.getBlue());
document.getPages().get_Item(1).getAnnotations().add(text2);

// Popup annotation attached to the text annotation
PopupAnnotation popup2 = new PopupAnnotation(document.getPages().get_Item(1), new Rectangle(100, 400, 400, 500));
popup2.setName("Manual popup");
popup2.setContents("Description: Manual popup annotation");
popup2.setColor(Color.getGreen());
text2.setPopup(popup2);
document.getPages().get_Item(1).getAnnotations().add(popup2);

// Save back to the file the document was loaded from
document.save();

If you need further assistance, please feel free to contact us.

Best Regards,

Hi Fahad,


Thanks for your response.

I think you misunderstood my first post. I mean the binary content of the file, not the content you see in a PDF reader (such as Adobe).

If you run your code, you can see that the binary content is modified a lot (you can use any compare tool to see it).

In contrast, if you use Adobe to add a comment (annotation) to a PDF file, it does not modify the binary content of the existing part. Instead, it appends the new content after the original %%EOF marker (visible if you open the PDF in a text editor). Again, you can use a compare tool to see it.

As noted, PDF has a specification for incremental updates, and PDFBox has a saveIncremental API that does the job. Please let me know whether Aspose has a similar API. Thank you.
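
The append-only behavior described above can also be checked in code rather than with a compare tool. The following is only a minimal sketch using the Java standard library: IncrementalUpdateCheck and isAppendOnly are hypothetical names chosen for illustration and are not part of Aspose.Pdf or PDFBox. It assumes that "incremental" means every byte of the original file is still present at the same offset and the new file is strictly longer.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of Aspose.Pdf or PDFBox.
final class IncrementalUpdateCheck {

    // Returns true when `updated` consists of all bytes of `original`,
    // unchanged and at the same offsets, followed by at least one extra byte.
    static boolean isAppendOnly(byte[] original, byte[] updated) {
        if (updated.length <= original.length) {
            return false; // appending content can only make the file longer
        }
        for (int i = 0; i < original.length; i++) {
            if (original[i] != updated[i]) {
                return false; // an existing byte was rewritten
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: IncrementalUpdateCheck <original.pdf> <updated.pdf>");
            return;
        }
        byte[] original = Files.readAllBytes(Paths.get(args[0]));
        byte[] updated = Files.readAllBytes(Paths.get(args[1]));
        System.out.println(isAppendOnly(original, updated) ? "append-only" : "rewritten");
    }
}
```

Running it against the file before and after adding the annotation should report "append-only" if the save was incremental, and "rewritten" if the existing bytes were changed.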

Best Regards,
Tuyen

Hi Tuyen,


Thanks for sharing further details.

I am looking into it and will share my findings with you shortly.

Best Regards,

Hi Tuyen,


Please note that the document is designed to allow editing of annotations and forms only. We can therefore make the allowed changes while keeping the extended rights, provided the document is loaded and saved by file path. I compared the original and modified files with the Adobe Acrobat DC compare tool, and it shows 0 changes apart from the 1 added annotation. I have attached the files and the comparison results for your reference.

If you need further assistance, please feel free to contact us.

Best Regards,

Hi Fahad,


Thanks for your response, it helped.

If I load a PDF into a Document directly from a file, add annotations to it, and then save that Document, it works (it really is an incremental update, and it does not break the signature of my signed PDF).

If I read a PDF into a byte array, create a Document from those bytes, add annotations to it, save the Document to a ByteArrayOutputStream, and then write the output bytes to a new file, it does not work. When I compared the files, the binary content was completely changed, and the signatures in my PDF were broken!

As I mostly work with binary content, can you help me with this? Thank you.

Best Regards,
Tuyen

Hi Tuyen,


Thanks for sharing more details.

I have tested the scenario with the following code snippet using Aspose.Pdf for Java 17.2.0 and did not notice any issue. I also checked the binary comparison, and it was fine as well. I have attached all files for your reference.

JAVA

String imageName = "pdf-sample-test.pdf";
File file = new File(dataDir, imageName);
byte[] dataBinary = null;
try {
    dataBinary = Files.readAllBytes(file.toPath());
} catch (IOException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}

// Load the document from memory instead of from a file path
Document document = new Document(new ByteArrayInputStream(dataBinary));
final ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
TextAnnotation text2 = new TextAnnotation(document.getPages().get_Item(1), new Rectangle(200, 500, 220, 550));
text2.setName("Text annot with manual popup");
text2.setTitle("This title not in popup");
text2.setContents("Description: Text annot with manual popup annotation");
text2.setIcon(TextIcon.Comment);
text2.setColor(Color.getBlue());

document.getPages().get_Item(1).getAnnotations().add(text2);
PopupAnnotation popup2 = new PopupAnnotation(document.getPages().get_Item(1), new Rectangle(100, 400, 400, 500));
popup2.setName("Manual popup");
popup2.setContents("Description: Manual popup annotation");
popup2.setColor(Color.getGreen());
text2.setPopup(popup2);
document.getPages().get_Item(1).getAnnotations().add(popup2);
document.save(outputStream);

// Write the in-memory result back to the original file
byte[] outputBinary = outputStream.toByteArray();
Files.write(file.toPath(), outputBinary);

If you still face any issue, please share your sample code along with input / output file. It will help us to replicate the issue on our end and address it accordingly.

We are sorry for the inconvenience.

Best Regards,

Hi Fahad,


Thanks for your response. I don’t see any attached files in your last post.

I have already tried your latest code, and unfortunately it does not work: the binary content is changed a lot. I tried it with the “pdf-sample.pdf” file you sent earlier. The original document is 8 KB, and after I ran your code it shrank to 7 KB.

I used the same Aspose.Pdf for Java version, 17.2.0.

The rest of my environment: jdk1.8.0_71, Windows 8.

Best Regards,
Tuyen

Hi Tuyen,


Sorry for missing the attachment. I have now attached the files to my previous post. I am also attaching the compare tool report to this post for your reference. I would appreciate it if you could share your sample code along with a sample PDF file, and I will try to replicate the issue on Windows. I am using jdk1.8.0_72 on macOS.

We are sorry for the inconvenience caused.

Best Regards,

Hi Fahad,


Thanks for your quick response. I have already compared your two files, “pdf-sample-test.pdf” and “pdf-sample+copy+2.pdf”. If you use WinMerge to compare them, you can see many differences in the binary content.

Best Regards,
Tuyen

Hi Tuyen,


Thanks for sharing further details.

Please note that our output is based on the PDF specifications from Adobe, so please try comparing the files with the Adobe Acrobat Compare Files tool instead of WinMerge.

We are sorry for the inconvenience.

Best Regards,

Hi Fahad,


Thanks for the information. I would like to summarize the situation:

1. PDF has a specification for Incremental Update. It is part of the PDF specification, not specific to Adobe.
2. We can confirm an Incremental Update with a binary/text comparison tool. WinMerge is a good one.
3. Adobe supports Incremental Update: when you add a “sticky note” to a PDF and use WinMerge to compare the old and new files, you’ll see that it does not change the content of the old file. It only appends a new part (the so-called “Incremental Update”).
4. PDFBox has an API for Incremental Update: saveIncremental.
5. Aspose PDF Java also performs an Incremental Update if we load a document directly from a file path, add some annotations, and then save to that same file.
6. Aspose PDF Java: if we load a document from binary, add some annotations, and then save to a different binary, the content is totally changed (not an Incremental Update).

Can you look into the difference between items 5 and 6? If there is no way to change the current behavior, could you consider adding a “flag” to your API in the next version so that Incremental Update works in case 6 above?
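
Item 3 can also be observed without a diff tool by counting end-of-file markers: each incremental update section ends with its own %%EOF, so a file saved incrementally should contain at least one more marker than the original. Below is a rough sketch of that heuristic using only the Java standard library. EofMarkerCounter and countEofMarkers are hypothetical names, not an Aspose or PDFBox API. It is only a heuristic, because the same byte sequence could in principle occur inside an uncompressed stream, and some files (for example linearized ones) already contain more than one marker, so the counts before and after should be compared rather than read in isolation.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Aspose.Pdf or PDFBox.
final class EofMarkerCounter {

    private static final byte[] MARKER = "%%EOF".getBytes(StandardCharsets.US_ASCII);

    // Counts non-overlapping occurrences of the ASCII sequence "%%EOF".
    static int countEofMarkers(byte[] pdf) {
        int count = 0;
        for (int i = 0; i + MARKER.length <= pdf.length; i++) {
            boolean match = true;
            for (int j = 0; j < MARKER.length; j++) {
                if (pdf[i + j] != MARKER[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                count++;
                i += MARKER.length - 1; // continue scanning after this marker
            }
        }
        return count;
    }
}
```

If countEofMarkers returns the same value for the input and the output, or a smaller one, the file was most likely rewritten rather than updated incrementally.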

Best Regards,
Tuyen

Hi Tuyen,


Thanks for sharing further details.

Aspose PDF Java: If we load a document from binary, then add some annotations, then save to a different binary, the content is totally changed (not Incremental Update)

I have tested the scenario using the Adobe Compare tool, and it treats the result as an Incremental Update. However, the results in WinMerge are different. I have logged an enhancement ticket as PDFJAVA-36627 in our issue tracking system. We will look further into the details of this enhancement and keep you posted on its status. Please be patient and allow us a little time.

We are sorry for this inconvenience.

Best Regards,


Hi Fahad,


Thank you for looking into this. For your information, to test this Incremental Update, we can use a signed pdf document.

If the modification breaks the signature, it’s not an Incremental Update. Otherwise, it’s what I expected.

As long as the modification doesn’t break signatures of a signed pdf document, that’s Incremental Update.

Adobe already handles Incremental Update well: if we add annotations to a signed PDF document using Adobe, it does not break any of the document’s signatures. We would like to achieve the same thing with Aspose PDF.
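
The signature test above can be made more concrete. A PDF signature's /ByteRange array lists (offset, length) pairs identifying the bytes covered by the signature digest, so a modification can only leave the signature intact if those bytes are identical in the updated file. The following is a sketch of that necessary condition only, not a signature validator. SignedRangeCheck and signedBytesPreserved are hypothetical names, and the sketch assumes the caller has already read the /ByteRange values from the signature dictionary by other means.

```java
// Hypothetical helper for illustration; not part of Aspose.Pdf or PDFBox.
final class SignedRangeCheck {

    // byteRange holds (offset, length) pairs, as in a signature's /ByteRange array.
    // Returns true when every covered byte is identical in both files.
    static boolean signedBytesPreserved(byte[] original, byte[] updated, int[] byteRange) {
        if (byteRange.length % 2 != 0) {
            throw new IllegalArgumentException("byteRange must contain (offset, length) pairs");
        }
        for (int k = 0; k < byteRange.length; k += 2) {
            int offset = byteRange[k];
            int length = byteRange[k + 1];
            if (offset < 0 || length < 0) {
                throw new IllegalArgumentException("offsets and lengths must be non-negative");
            }
            long end = (long) offset + length;
            if (end > original.length || end > updated.length) {
                return false; // the covered range is missing from one of the files
            }
            for (int i = offset; i < end; i++) {
                if (original[i] != updated[i]) {
                    return false; // a signed byte was changed
                }
            }
        }
        return true;
    }
}
```

A true result does not prove that a viewer will accept the signature, but a false result means the signed content was altered, which matches the "signature is broken" outcome described in this thread.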

Best Regards,
Tuyen

Hi Tuyen,


Thanks for sharing further details.

I will test the above scenario and share my findings with you shortly.

Best Regards,

Hi Tuyen,


I have tested the scenario with a sample signed PDF document and compared the result using the Adobe Compare tool. It looks fine to me. I have attached the comparison report and the original PDF files for your reference.

If your requirement differs from my understanding, I would appreciate it if you could share your sample PDF file along with the code. It will help us understand your requirement exactly and address it accordingly.

We are sorry for the inconvenience.

Best Regards,

Hi Fahad,


Thanks for your response. I’m sorry that we are still not on the same page and I hope this is more clear to you with this new information.

I have a common method to add annotation like this:

private static void addAnnotationToDocument(Document document) {
    TextAnnotation text2 = new TextAnnotation(document.getPages().get_Item(1), new Rectangle(200, 500, 220, 550));
    text2.setName("Text anno name");
    text2.setTitle("Text anno title");
    text2.setContents("Text anno content");
    text2.setIcon(TextIcon.Comment);
    text2.setColor(Color.getBlue());

    document.getPages().get_Item(1).getAnnotations().add(text2);
    PopupAnnotation popup2 = new PopupAnnotation(document.getPages().get_Item(1), new Rectangle(100, 400, 400, 500));
    popup2.setName("Popup anno name");
    popup2.setContents("Popup anno content");
    popup2.setColor(Color.getGreen());
    text2.setPopup(popup2);
    document.getPages().get_Item(1).getAnnotations().add(popup2);
}

When I load a signed document directly from a file, call the above method, save to the same file, the signature is not broken which is good (see signed_doc_aspose_good.pdf)

Document document = new Document(dataDir + File.separatorChar + fileName);
addAnnotationToDocument(document);
document.save();

However, when I read the document into a byte array, create a PDF document from those bytes, add the annotation using the same shared method, save to an output byte array, and then write that to a file, the signature is broken, which is bad (see signed_doc_aspose_bad.pdf).

File file = new File(dataDir, pdfFileName);
byte[] dataBinary = null;
try {
    dataBinary = Files.readAllBytes(file.toPath());
} catch (IOException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}

Document document = new Document(new ByteArrayInputStream(dataBinary));
final ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
addAnnotationToDocument(document);
document.save(outputStream);

byte[] outputBinary = outputStream.toByteArray();
Files.write(file.toPath(), outputBinary);

For your reference in my attachments:
+ signed_doc_org.pdf: original signed pdf
+ signed_doc_adobe_always_good.pdf: I used Adobe to add an annotation to the file, signature is not broken
+ signed_doc_aspose_good.pdf: my first case
+ signed_doc_aspose_bad.pdf: my second case

When you open my attachments, the good cases report “Signed and all signatures are valid, but with unsigned changes after the last signature.” The bad case reports “At least one signature is invalid.”

I figured out that Incremental Update is the key point here: all the good cases use it, and the bad case does not. WinMerge is my tool for checking it (with an Incremental Update, the old content of the signed PDF is unchanged and new content is only appended).

I would like to know if there is any way to get around the “bad” case above other than saving to a temporary file. If not, could you consider fixing it in the next version?

Please let me know if this is still unclear to you. Thanks.

Best Regards,
Tuyen

Hi Tuyen,


Thanks for sharing the details.

I have tested the scenario and managed to reproduce the problem: loading a document from binary and then adding an annotation invalidates the signature. To have it corrected, I have logged a ticket, PDFJAVA-36629, in our issue tracking system. We will look further into the details of this problem and notify you of the status of its resolution in this forum thread. Please be patient and allow us a little time. We are sorry for this inconvenience.

Best Regards,

@vutuyen2636

Thanks for your patience.

We are pleased to inform you that the earlier reported issues PDFJAVA-36627 and PDFJAVA-36629 have been resolved in the latest version, Aspose.Pdf for Java 17.12. We have added a method, saveIncrementally(..), to the Document class, so you can now save a document to a stream object using Incremental Updates. To avoid the issue, please replace the line document.save(outputStream); with document.saveIncrementally(outputStream); in your code snippet.

Please try the suggested method with the latest release, and if you face any issue, please feel free to contact us.