Embedding files into PDF/A-3

Hello,

I’m trying to embed an XML file and a PDF file into an existing PDF and then convert it to PDF/A-3a.
This works, and both Adobe and Foxit Reader display the converted PDF/A-3a with both attachments.
Foxit Reader, however, shows an “unknown file size” for both attachments, and a customer trying to extract
the attachments has run into problems related to the size.
I am using Aspose.PDF 11.2. The problem seems to be that the Params object is Nothing after the FileSpecification is created, so it does not contain the size of the file.
I suspect this may be a bug in this Aspose.PDF version. Is there a way to get a Params object in the file specifications that is correctly filled with the file size and modified date of the files?

            Dim spdf As String = "c:\test\sourcepdf.pdf"

            Dim doc As Document = New Document(spdf)

            Dim embeddedXML As FileSpecification = New FileSpecification("c:\test\abc.xml", "abc.xml")

            doc.EmbeddedFiles.Add(embeddedXML)

            doc.EmbeddedFiles.Item(1).Name = Path.GetFileName(doc.EmbeddedFiles.Item(1).Name)
            doc.EmbeddedFiles.Item(1).UnicodeName = Path.GetFileName(doc.EmbeddedFiles.Item(1).UnicodeName)


            'Add attachment to document's attachment collection
            Dim embeddedPDF As FileSpecification = New FileSpecification("c:\test\abc.pdf", "abc.pdf")
            doc.EmbeddedFiles.Add(embeddedPDF)

            doc.EmbeddedFiles.Item(2).Name = Path.GetFileName(doc.EmbeddedFiles.Item(2).Name)
            doc.EmbeddedFiles.Item(2).UnicodeName = Path.GetFileName(doc.EmbeddedFiles.Item(2).UnicodeName)

            ' perform PDF/A_3a conversion
            doc.Convert(Path.Combine(myDir, "log.xml"), PdfFormat.PDF_A_3A, ConvertErrorAction.Delete)
            ' save final PDF file
            doc.Save(Path.Combine(myTargetDir, Path.GetFileName(spdf)))

@Linni

Thank you for contacting support.

Please note that you are using quite an old version of the Aspose.Pdf API, so please upgrade to Aspose.Pdf for .NET 17.12, which is the latest available version at the moment. If the issue persists with the latest version, please share the source and generated PDF files along with a sample XML file so that we can investigate further.

With 17.12, I get an error message as soon as I try to embed the first file:

doc.EmbeddedFiles.Add(embeddedXML)

System.ArgumentException: “Tree structure is not initialized.”

@Linni

Kindly share the requested files with us so that we can investigate further.

Hello,

With version 17.12 of Aspose.PDF, I get an error as soon as I try to add a file to doc.EmbeddedFiles
(“Tree structure could not be initialized.”, see the code sample below).

I suspect the problem is somewhere here:
The FileSpecification object I create using the command below has a Params property which is null (Nothing).
The constructor does not populate Params, and this is still the case with v17.12, even though the file size and modified date should be known from the file.

Dim embeddedPdf As FileSpecification = New FileSpecification(spdf, Path.GetFileName(spdf))

So once I add embeddedPdf to doc.EmbeddedFiles, it IS added to the final PDF/A-3a, but Size and Modified are shown as empty in Foxit Reader, and our customer’s downstream system can’t extract the attachments because it cannot determine their size.

Above I’ve included a PDF which was created with version v11.2. Below is the sample code I used.
Is there any way to get the size and modified date into the PDF/A-3A correctly?

Imports System.IO
Imports Aspose.Pdf

Sub Main()

    Dim licensePdf As New Aspose.Pdf.License
    licensePdf.SetLicense("Aspose.Total.lic")

    Dim myDir As String = "C:\Temp\"
    Dim myTargetDir As String = "C:\Temp\Fertig\"

    Directory.CreateDirectory(myTargetDir)

    Dim sourcePDFs As String() = Directory.GetFiles(myDir, "*.pdf")


    For Each spdf In sourcePDFs


        Dim doc As Document = New Document(spdf)
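        ' Derive the XML path from the PDF path. Path.ChangeExtension works regardless of
        ' the extension's case, whereas String.Replace is case-sensitive and "*.PDF" never matches.
        Dim xmlPath As String = Path.ChangeExtension(spdf, ".xml")
        If File.Exists(xmlPath) Then

            Dim embeddedXML As FileSpecification = New FileSpecification(xmlPath, Path.GetFileName(xmlPath))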

            ' Here I get "Tree Structure not initialized" with the Aspose.PDF 17.12 demo
            doc.EmbeddedFiles.Add(embeddedXML)


            doc.EmbeddedFiles.Item(1).Name = Path.GetFileName(doc.EmbeddedFiles.Item(1).Name)
            doc.EmbeddedFiles.Item(1).UnicodeName = Path.GetFileName(doc.EmbeddedFiles.Item(1).UnicodeName)



            'Add attachment to document's attachment collection
            Dim embeddedPdf As FileSpecification = New FileSpecification(spdf, Path.GetFileName(spdf))
            doc.EmbeddedFiles.Add(embeddedPdf)

            doc.EmbeddedFiles.Item(2).Name = Path.GetFileName(doc.EmbeddedFiles.Item(2).Name)
            doc.EmbeddedFiles.Item(2).UnicodeName = Path.GetFileName(doc.EmbeddedFiles.Item(2).UnicodeName)

            ' perform PDF/A_3a conversion
            doc.Convert(Path.Combine(myDir, "log.xml"), PdfFormat.PDF_A_3A, ConvertErrorAction.Delete)
            ' save final PDF file
            doc.Save(Path.Combine(myTargetDir, Path.GetFileName(spdf)))

        End If
    Next

End Sub

@Linni

Kindly share your source PDF file along with the files you are embedding in it so that we can investigate further. You may attach the files here as a single .zip archive, or upload them to a free file hosting service such as Google Drive or Dropbox and share the link.

Hello,

Here are the sample files. The PDF is supposed to be converted to PDF/A-3, with the XML file and the original PDF itself embedded as attachments.

SKonicaMino17112812330.zip (187.2 KB)

Best regards,

Markus Linneweber

@Linni

Thank you for sharing requested data.

I have tested with the data you shared and have not been able to reproduce the issue. Your code sample runs without any problem, and the resulting PDF file appears to meet your requirements. I have attached the generated file for your reference: SKonicaMino17112812330.PDF. Please try Aspose.PDF for .NET 18.1 in your environment and share your feedback with us.

Hello,

I’ve tried the demo version of 18.1 and got the same problem at the same place: it “couldn’t initialize the tree structure”.
Also, the generated PDF/A-3 which you kindly sent me is still missing the information in Foxit Reader
(see Missing Information.png (374.8 KB)).
I suspect the problem in the engine is that if I create a FileSpecification object with
Dim embeddedXML As FileSpecification = New FileSpecification(spdf.Replace(".PDF", ".XML"), Path.GetFileName(spdf.Replace(".PDF", ".XML"))),
the Params property of the FileSpecification is not filled automatically but is Nothing/null
(see MissingParams.png (11.7 KB)).
According to the documentation, the Params class contains properties such as the size and modified date of an attachment.
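
For comparison, the two values in question can be read directly from the file system. The following is only a minimal sketch using System.IO; the module name and the ReadAttachmentInfo helper are hypothetical names introduced for illustration, and the sketch neither uses nor fixes the Aspose.PDF Params property. It only shows which values one would expect a populated Params object to carry.

```vbnet
Imports System
Imports System.IO

Module AttachmentInfoSketch

    ' Hypothetical helper (not part of Aspose.PDF): returns the size in bytes
    ' and the last-modified timestamp of a file on disk.
    Function ReadAttachmentInfo(filePath As String) As Tuple(Of Long, DateTime)
        Dim info As New FileInfo(filePath)
        If Not info.Exists Then
            Throw New FileNotFoundException("Attachment not found.", filePath)
        End If
        Return Tuple.Create(info.Length, info.LastWriteTime)
    End Function

    Sub Main()
        ' Write a small temporary file so the sketch runs without external data.
        Dim tempPath As String = Path.Combine(Path.GetTempPath(), "attachment-info-sketch.xml")
        File.WriteAllText(tempPath, "<root/>")

        Dim result = ReadAttachmentInfo(tempPath)
        Console.WriteLine("Size: {0} bytes", result.Item1)
        Console.WriteLine("Modified: {0:o}", result.Item2)

        File.Delete(tempPath)
    End Sub

End Module
```

Comparing these values with what a PDF viewer shows for the embedded file is one way to check whether the size and modified date actually made it into the output.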

This information, which needs to be embedded in the PDF/A-3, is missing from the final file.

Here is a screenshot of the Tree Structure Error: TreeError.png (12.2 KB)

@Linni

Thank you for elaborating it further.

I have tested with the data you shared and have been able to observe the missing information and the missing Params in the generated PDF file. A ticket with ID PDFNET-44112 has been logged in our issue management system for further investigation and resolution. The issue ID has been linked to this thread so that you will be notified as soon as the issue is resolved.

However, I have not been able to reproduce the exception you reported. It is not thrown even with all options enabled in the Visual Studio exception settings (see Exceptions.JPG). If the exception is thrown in your environment, is the resulting PDF/A-3 file still created on your side? Kindly share a sample project that reproduces the issue, because I am unable to replicate it with the code you have already shared.

We are sorry for the inconvenience.

Hello,

The demo code I sent was the code I used. Perhaps the error only appears with the evaluation version of 18.1.
The PDF/A-3 is not created when the exception occurs.
I’ve attached the test project: AsposePDF2PDF3a.zip (240.7 KB)
(which worked with Aspose.PDF version 12.1, apart from the missing attachment properties, and is currently using 18.1).

(I had to remove the DLL and the packages, otherwise the zip file was too large to upload.) I added a reference to Aspose 18.1 using the NuGet package manager.

@Linni

Thank you for sharing requested data.

I have worked with the data shared by you and have been able to reproduce the Tree Structure Exception in our environment. A ticket with ID PDFNET-44147 has been logged in our issue management system for further investigation and resolution. The issue ID has been linked with this thread so that you will receive notification as soon as the issue is resolved.
We are sorry for the inconvenience.

Hello,

Is there any news on approximately when those tickets will be resolved?

Thanks in advance!

@Linni

The issues you reported, PDFNET-44112 and PDFNET-44147, are still pending analysis. I am afraid that investigation and resolution may take several months because of previously logged issues in the queue.

However, we also offer Paid Support, where issues are investigated with higher priority. Customers with a paid support subscription report their issues there so that they are investigated urgently. If your reported issue is a blocker, please consider subscribing to Paid Support. For further information, please visit the Paid Support FAQ.