PDF/A-1 conversion creates invalid XRef table

Hello,


When a document is converted to PDF/A, the resulting document contains an invalid XRef table with gaps in the object references. This causes problems in our software when we sign the document several times: only the last signature is valid and the earlier ones become invalid.


Here is what happens during conversion:


XRef table in the original document looks like this:
xref
0 7
0000000000 65535 f
0000000257 00000 n
0000000015 00000 n
0000000345 00000 n
0000000145 00000 n
0000000396 00000 n
0000000441 00000 n
trailer
 
The conversion creates the following XRef table:
xref
0 15
0000000000 65535 f
0000000018 00000 n
0000000107 00000 n
0000000278 00000 n
0000000330 00000 n
0000000481 00000 n
0000000577 00000 n
0000000727 00000 n
0000000750 00000 n
0000000771 00000 n
0000001040 00000 n
0000001227 00000 n
0000001297 00000 n
0000001521 00000 n
0000001649 00000 n
17 1
0000002323 00000 n
19 2
0000002547 00000 n
0000002609 00000 n
22 4
0000031189 00000 n
0000031970 00000 n
0000032086 00000 n
0000032310 00000 n
trailer


There are 3 gaps (marked by the shorter subsection header lines "17 1", "19 2" and "22 4"), because objects 15, 16, 18 and 21 do not exist. According to the PDF specification (section 7.5.4 of ISO 32000-1), the XRef sections of a file, taken together, must not contain any gaps:
The cross-reference table (comprising the original cross-reference section and all update sections) shall contain one entry for each object number from 0 to the maximum object number defined in the file, even if one or more of the object numbers in this range do not actually occur in the file.
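
The rule quoted above can be checked mechanically. The following is only a minimal sketch, not part of Aspose.Pdf: the class and method names (XrefGapCheck, FindMissingObjects) are made up for illustration, and it assumes a classic text XRef table whose subsection headers have the form "first count".

```csharp
using System;
using System.Collections.Generic;

public static class XrefGapCheck
{
    // Returns the object numbers that no subsection header ("first count") covers.
    public static List<int> FindMissingObjects(string xrefText)
    {
        var covered = new HashSet<int>();
        int max = -1;
        foreach (var raw in xrefText.Split('\n'))
        {
            var parts = raw.Trim().Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
            // A subsection header has exactly two integer fields; entries have three.
            if (parts.Length != 2) continue;
            if (!int.TryParse(parts[0], out int first) || !int.TryParse(parts[1], out int count)) continue;
            for (int i = first; i < first + count; i++) covered.Add(i);
            max = Math.Max(max, first + count - 1);
        }

        var missing = new List<int>();
        for (int i = 0; i <= max; i++)
            if (!covered.Contains(i)) missing.Add(i);
        return missing;
    }
}
```

For the headers of the converted table above ("0 15", "17 1", "19 2", "22 4"), the method returns 15, 16, 18 and 21.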


As this is the first and only XRef table, there shouldn't be any gaps in it. Those gaps make this PDF invalid. When Reader opens the PDF, it tries to repair it in memory, which is why you can open this PDF in Reader. However, signature validation works on the original bytes of the PDF and uses the original XRef table, not the repaired one, hence the signature validation ends with an error.


When we manually correct the XRef table, signing with multiple signatures works fine. Here is an example of how it can be fixed (by marking the missing objects as free):
xref
0 26
0000000015 65535 f
0000000018 00000 n
0000000107 00000 n
0000000278 00000 n
0000000330 00000 n
0000000481 00000 n
0000000577 00000 n
0000000727 00000 n
0000000750 00000 n
0000000771 00000 n
0000001040 00000 n
0000001227 00000 n
0000001297 00000 n
0000001521 00000 n
0000001649 00000 n
0000000016 00000 f
0000000018 00000 f
0000002323 00000 n
0000000021 00000 f
0000002547 00000 n
0000002609 00000 n
0000000000 00000 f
0000031189 00000 n
0000031970 00000 n
0000032086 00000 n
0000032310 00000 n
trailer
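
The manual fix above follows the free-list convention: each free entry stores the number of the next free object in its first field, and the last free entry points back to object 0. Below is a rough sketch of how such a table could be generated. The names XrefGapFill and BuildContiguousTable are hypothetical, and the sketch assumes generation 0 for every object except object 0, as in the table above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class XrefGapFill
{
    // offsets: object number -> byte offset of every in-use object (object 0 excluded).
    // Returns a single subsection "0 N" in which missing objects are chained as free entries.
    public static List<string> BuildContiguousTable(IDictionary<int, long> offsets)
    {
        int max = offsets.Keys.Max();

        // Free objects in ascending order; object 0 is always the head of the free list.
        var free = Enumerable.Range(0, max + 1)
                             .Where(n => n == 0 || !offsets.ContainsKey(n))
                             .ToList();
        var nextFree = new Dictionary<int, int>();
        for (int i = 0; i < free.Count; i++)
            nextFree[free[i]] = i + 1 < free.Count ? free[i + 1] : 0;

        var lines = new List<string> { "xref", "0 " + (max + 1) };
        for (int n = 0; n <= max; n++)
        {
            if (nextFree.TryGetValue(n, out int next))
                // Free entry: next free object number, then the generation number.
                lines.Add(string.Format("{0:D10} {1:D5} f", next, n == 0 ? 65535 : 0));
            else
                // In-use entry: byte offset, generation 0 (assumption).
                lines.Add(string.Format("{0:D10} {1:D5} n", offsets[n], 0));
        }
        return lines;
    }
}
```

The sketch returns the lines without line endings. When they are written into a file, each entry must occupy exactly 20 bytes including a two-character end-of-line marker, as required by section 7.5.4.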


We are attaching 3 documents:
1) Original document (with one full XRef table) - test_no_sig.pdf
2) Converted document (with gaps) - test_no_sig_converted.pdf
3) Manually fixed document - test_no_sig_converted_corrected.pdf


Here is the conversion code snippet (C#):


Aspose.Pdf.Document pdf = new Aspose.Pdf.Document(file);
pdf.Convert(@"e:\aspose1.log.txt", Aspose.Pdf.PdfFormat.PDF_A_1B, Aspose.Pdf.ConvertErrorAction.Delete);
pdf.Save(@"e:\converted_eDane1.pdf");


This is a production problem, so please fix it with higher priority.


Thanks and regards,
Jaro

Hi Jaro,


Thanks for using our APIs.

I have tested the scenario and have observed that the resultant file is PDF/A-1b compliant. The source file does not contain any signature, so I am unable to observe the signature corruption issue. For your reference, I have also attached the resultant PDF generated with Aspose.Pdf for .NET 11.9.0.

Hi Nayyer,

It seems you are missing the point. The actual problem is: the XRef table is not built according to the spec. It has some consequences and one of them is that multiple signatures cannot be validated in Adobe Reader (even though the document is opened seemingly normally).

Even in the document you attached there is the following XRef table:

xref
0 14
0000000000 65535 f
0000000018 00000 n
0000000107 00000 n
0000000278 00000 n
0000000330 00000 n
0000000481 00000 n
0000000565 00000 n
0000000715 00000 n
0000000738 00000 n
0000001006 00000 n
0000001192 00000 n
0000001262 00000 n
0000001486 00000 n
0000001614 00000 n
16 1
0000002288 00000 n
18 2
0000002512 00000 n
0000002574 00000 n
21 4
0000031412 00000 n
0000032193 00000 n
0000032309 00000 n
0000032533 00000 n
trailer

To see the XRef table, just open the PDF in a proper text editor and scroll down to the end - you will find it there. XRef tables of this kind are produced by Aspose.PDF versions 11.5.0 to 11.9.0 - we have tested it. The result is NOT a fully PDF/A-1b compliant document. I pasted the relevant part of the specification in my previous post.
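
For those who prefer not to scroll through the file in a text editor, the table can also be pulled out with a few lines of code. This is just a sketch under some assumptions: the helper name XrefDump.ReadLastXref is made up, and it only handles a classic text XRef table (as used by PDF/A-1 files), located through the last "startxref" keyword.

```csharp
using System;
using System.Text;

public static class XrefDump
{
    // Returns the text of the cross-reference section that the last "startxref" points to.
    public static string ReadLastXref(byte[] pdf)
    {
        // ISO-8859-1 maps every byte to one char, so string indexes equal byte offsets.
        string text = Encoding.GetEncoding("ISO-8859-1").GetString(pdf);
        int marker = text.LastIndexOf("startxref", StringComparison.Ordinal);
        if (marker < 0) throw new InvalidOperationException("startxref not found");

        // The byte offset of the XRef section follows the keyword.
        int pos = marker + "startxref".Length;
        while (pos < text.Length && char.IsWhiteSpace(text[pos])) pos++;
        int end = pos;
        while (end < text.Length && char.IsDigit(text[end])) end++;
        int offset = int.Parse(text.Substring(pos, end - pos));

        if (offset + 4 > text.Length || string.CompareOrdinal(text, offset, "xref", 0, 4) != 0)
            throw new InvalidOperationException("No classic xref table at the given offset.");
        int trailer = text.IndexOf("trailer", offset, StringComparison.Ordinal);
        if (trailer < 0) throw new InvalidOperationException("trailer not found");
        return text.Substring(offset, trailer - offset);
    }
}
```

It can be called as XrefDump.ReadLastXref(System.IO.File.ReadAllBytes(path)). For a signed document with incremental updates it returns only the XRef section of the last update, not the table of the original revision.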

Yes, the Aspose library won't notice that it is incorrect, but the defect shows up when you add several signatures to the document.

I'm attaching your document signed with two signatures (test_no_sig_converted_eDane1_signed.pdf). When you analyze it, you'll see that the first part (up to the signature revisions) is bit-by-bit exactly the same, but the first signature shows in Acrobat Reader in red (screenshot attached as well - test_no_sig_converted_eDane1_signed_Adobe.png).

Then I'm attaching another document with a fixed XRef table (test_no_sig_converted_eDane1_fixed.pdf). This is your document, fixed by us manually (see the XRef table at the end of the document).

And the last document is the second document after double signing (test_no_sig_converted_eDane1_fixed_signed.pdf). When opened in Acrobat Reader both signatures are green. (screenshot attached as well - test_no_sig_converted_eDane1_fixed_signed_Adobe.png).

Please send it straight to your DEV team, they will understand. We did the analysis for you, so that you can fix it quickly.

BR,

Jaro

Hi Jaro,

Thanks for sharing the details.

Based on the details shared earlier, I have logged an investigation ticket as PDFNET-41272 in our issue tracking system. We will look further into the details of this problem and will keep you updated on the status of the correction. Please be patient and allow us a little time. We are sorry for this inconvenience.

Hi,

This problem seems to be still unresolved in Aspose.Pdf 16.10.1.

Jaro

Hi Jaro,


Thanks for your patience.

The issue reported earlier is still under investigation and is not yet resolved. However as soon as we have some definite updates regarding its resolution, we will let you know.

We are sorry for this delay and inconvenience.

The issues you have found earlier (filed as PDFNET-41272) have been fixed in Aspose.Pdf for .NET 17.2.0.



Hi support,


It has not been completely fixed. It works with the document from the very first post (test_no_sig.pdf), but it does not work with the document that Nayyer sent (test_no_sig_converted_eDane1.pdf).


I am attaching the results of XRef tables of converted documents using the following code snippet (one with false, one with true for the IsXrefGapsAllowed):


            Aspose.Pdf.Document pdf = new Aspose.Pdf.Document(file);
            pdf.IsXrefGapsAllowed = false;
            pdf.Convert(@"aspose.log.txt", Aspose.Pdf.PdfFormat.PDF_A_1B, Aspose.Pdf.ConvertErrorAction.Delete);

            // Save output document
            pdf.Save(@"e:\converted_aspose.pdf");


The source document was the one that Nayyer sent (test_no_sig_converted_eDane1.pdf). Even if the property is set to false, the document contains gaps in the XRef table.


The results are the same for both PDF/A-1b and PDF/A-1a.


Jaroslav Ondriska

Hi Jaroslav,


Thanks for sharing the details.

The information has been associated with the earlier reported issue and the product team has been informed so that they can look further into this matter. As soon as we have some further updates, we will let you know.

We are also facing the same issue: the conversion creates an invalid PDF with multiple subsections in the XRef table.
The generated PDF, once signed, shows the signature as invalid. Signed PDF with LTV, single signature: https://www.sendspace.com/file/ba8leq

It is still not fixed. Please fix it on priority and let us know.

@cyginfo

The issue was already resolved in Aspose.PDF for .NET 17.6 version and in case you are still facing it, please share respective input PDF document and sample code snippet with us. We will further proceed to assist you accordingly.