Convert PDF to PDF/A - Invalid XML in output log stream for Document.Convert call

Hi,

I am trying to convert a PDF file using the following statement:

document.Convert(outputStream, PdfFormat.PDF_A_3B, ConvertErrorAction.Delete);

The conversion is unsuccessful. I use the following statements to read the output log content:

        outputStream.Position = 0;
        var asposeLog = new StreamReader(outputStream).ReadToEnd();

The content is:

<Compliance Name="Log" Operation="Validation" Target="PDF/A-3B"><Version>1.0</Version><Copyright>Copyright (c) 2001-2021 Aspose Pty Ltd. All Rights Reserved.</Copyright><Date>9/10/2021 10:59:17 AM</Date><File Version="1.4" Name="" Pages="1"><Security /><Catalog><Problem Severity="Error" Clause="Clause" Convertable="False">Can not convert signed file</Problem></Catalog></File>

As you can see, the “Compliance” tag is not closed, so I can't parse this XML because it is not well formed. That is the first issue.
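
As a stopgap until the log itself is fixed, here is a minimal sketch of how the output could still be parsed. It assumes the only defect is the missing closing `</Compliance>` tag, which is what the log above suggests but is not confirmed for every case. `ParseComplianceLog` is a hypothetical helper name, not part of Aspose.Pdf, and the sample string is a shortened copy of the log above.

```csharp
using System;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper (not an Aspose.Pdf API): parse the log as-is, and if it
// is not well formed, retry once with the missing root end tag appended.
static XDocument ParseComplianceLog(string log)
{
    try
    {
        return XDocument.Parse(log);
    }
    catch (XmlException)
    {
        // Assumption: the only defect is the absent </Compliance> end tag.
        // If the log is broken in some other way, this second parse throws too.
        return XDocument.Parse(log + "</Compliance>");
    }
}

// Shortened copy of the log shown above; it is also missing its closing tag.
var truncatedLog =
    "<Compliance Name=\"Log\" Operation=\"Validation\" Target=\"PDF/A-3B\">" +
    "<File Version=\"1.4\" Name=\"\" Pages=\"1\"><Security /><Catalog>" +
    "<Problem Severity=\"Error\" Clause=\"Clause\" Convertable=\"False\">" +
    "Can not convert signed file</Problem></Catalog></File>";

var parsed = ParseComplianceLog(truncatedLog);

// Collect the text of every <Problem> element reported in the log.
var problems = parsed.Descendants("Problem").Select(p => p.Value).ToList();
Console.WriteLine(string.Join(Environment.NewLine, problems));
```

Trying the unmodified log first means the same code keeps working once the closing tag is emitted correctly.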

The second issue is the “Can not convert signed file” error. Is there any way to overcome it other than removing the signature from the file?

Best regards,
Severin

@gomezw

We need to investigate this scenario further. Can you please share the complete sample code snippet along with a sample source PDF? We will test the scenario in our environment and address it accordingly.

test.pdf (1.1 MB)

        var document = new Document(@"..\..\..\..\..\..\..\Data\test.pdf");

        using (var outputStream = new MemoryStream())
        {
            var convertedSuccessful = document.Convert(outputStream, PdfFormat.PDF_A_3B, ConvertErrorAction.Delete);

            outputStream.Position = 0;

            var reader = new StreamReader(outputStream);
            var output = reader.ReadToEnd();
            // Throws XmlException: the root <Compliance> element in the log is never closed.
            var xml = XDocument.Parse(output);
        }

What happens to the document after a call to the “Convert” method returns false? As I understand it, the method returns false when the document could not be converted because of errors, the errors are written to the outputStream, and the document should be left unchanged. Is that correct?

I also noticed that the initial PDF format of the document in this example code is PDF_A_1B.
After the call to “Convert” it is PDF_UA_1. Why was the format changed to PDF_UA_1 when Convert returned “false”?

I am using Aspose.Pdf v21.8.0

@gomezw

Even if the conversion is not successful, the API still generates output in the given PDF/A format that shows the compliance errors. This is why you see the updated file format after conversion. We have logged an issue as PDFNET-50566 in our issue tracking system for further investigation. We will look into its details and keep you posted on the status of its correction. Please be patient and allow us some time.

We are sorry for the inconvenience.

Hi,

We got this error when we tried to convert a signed PDF document. Is there a way to tell the Aspose component to ignore the signature, so that it converts to the requested format without error and returns the unsigned result document?

@gomezw

There are some limitations in the case of PDF file format while converting a signed document into other formats. Therefore, we need to perform further investigation in order to determine the exact cause of the issue. We will surely let you know as soon as the analysis is done. Please give us some time.

Hi, can you give me an ETA for resolving issue PDFNET-50566?

@gomezw

The ticket has recently been logged in our issue management system and is pending review. It will be investigated and resolved on a first-come, first-served basis. We will inform you once we have definite updates regarding its resolution. Please be patient and allow us some time.

We are sorry for the inconvenience.