Converting PDF to PDF/A increases the output file size more than 10 times

Hello Aspose team,

We are using Aspose.PDF for .NET v22.1.
We found that after converting a PDF to PDF/A, the output file is more than 10 times larger than the input.
We do not use any Aspose optimization functionality, because we don’t want to lose any information that a customer may have put in the document (for example, metadata). Such a large increase in file size is unacceptable for us.
We use the following code for the conversion:

    private MemoryStream ConvertToPdfa(Stream stream)
    {
        var pdf = new Document(stream);

        // The stream passed to Convert receives the conversion log, not the converted document.
        using (var outputStream = new MemoryStream())
        {
            var pdfsecurity = new PdfFileSecurity();
            pdfsecurity.BindPdf(pdf);
            pdfsecurity.DecryptFile("");
            var isFileConverted = pdf.Convert(outputStream, PdfFormat.PDF_A_3B, ConvertErrorAction.Delete);
        }

        // The converted document itself is written by Save.
        var destinationStream = new MemoryStream();
        pdf.Save(destinationStream, SaveFormat.Pdf);

        return destinationStream;
    }

Could you please suggest how we can solve this issue?

Best regards

@gomezw,

Can you please attach the document you are having this issue with?

INC1023853790 (1) (002).pdf (360.1 KB)

@gomezw,

I just ran the following code, and the file size does not increase:

    private void Logic()
    {
        var doc = new Document($"{PartialPath}_input.pdf");

        doc.Save($"{PartialPath}_output.pdf"); // Making copy

        var docNew = new Document($"{PartialPath}_output.pdf");

        var outputStream = new MemoryStream();

        var isFileConverted = docNew.Convert(outputStream, PdfFormat.PDF_A_3B, ConvertErrorAction.Delete);

        if (isFileConverted)
        {
            docNew.Save(outputStream, SaveFormat.Pdf);
        }
    }

Input and output:
FormatIncreaseFileSize_input.pdf (360.1 KB)
FormatIncreaseFileSize_output.pdf (329.0 KB)

What was the idea with the decrypt lines?

I can see that FormatIncreaseFileSize_output.pdf is not in PDF/A format, whereas our code converts to PDF/A. That is probably why the size did not increase in your case.

image.png (17.7 KB)

That is probably because you save the conversion result to the Aspose log output stream. If you change your code from:
    docNew.Save(outputStream, SaveFormat.Pdf);
to:
    docNew.Save($"{PartialPath}_output.pdf", SaveFormat.Pdf);
you will see the difference.

@gomezw,

I was able to replicate the issue. I will create a bug for the dev team.

But for now, work with a stream so you do not have to worry about the increased size.

@gomezw
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFNET-54334

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

Actually, we are working with a stream, and the issue is also reproducible with a stream.

    private void Logic()
    {
        var inputFilePath = $"{PartialPath}_input.pdf";
        var outputFilePath = $"{PartialPath}_output.pdf";
        var inputStream = new MemoryStream(File.ReadAllBytes(inputFilePath));

        var doc = new Document(inputStream);

        var asposeLogOutputStream = new MemoryStream();

        var isFileConverted = doc.Convert(asposeLogOutputStream, PdfFormat.PDF_A_3B, ConvertErrorAction.Delete);

        if (isFileConverted)
        {
            var outputStream = new MemoryStream();
            doc.Save(outputStream, SaveFormat.Pdf);
            File.WriteAllBytes(outputFilePath, outputStream.ToArray());
        }
    }
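
To put a number on the size increase in a repro like the one above, a small check along these lines could be used. This is only a sketch: `SizeCheck`, `GrowthFactor`, and `ExceedsLimit` are hypothetical helper names, not part of Aspose.PDF, and the factor of 10 is simply the figure mentioned in this thread.

```csharp
using System;
using System.IO;

// Hypothetical helper class for illustration; not part of Aspose.PDF.
public static class SizeCheck
{
    // Returns how many times larger the output is than the input.
    public static double GrowthFactor(long inputBytes, long outputBytes)
    {
        if (inputBytes <= 0)
            throw new ArgumentOutOfRangeException(nameof(inputBytes), "Input size must be positive.");
        if (outputBytes < 0)
            throw new ArgumentOutOfRangeException(nameof(outputBytes), "Output size cannot be negative.");

        return (double)outputBytes / inputBytes;
    }

    // Returns true when the output stream is more than maxFactor times
    // the length of the input stream (for example, maxFactor = 10).
    public static bool ExceedsLimit(Stream input, Stream output, double maxFactor)
    {
        return GrowthFactor(input.Length, output.Length) > maxFactor;
    }
}
```

For example, calling `SizeCheck.GrowthFactor(inputStream.Length, outputStream.Length)` right after `doc.Save(outputStream, SaveFormat.Pdf)` in the repro would give the ratio to include in a report.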

Please let me know when we can expect a fix for this bug.

@gomezw,

Sadly, I am not part of the dev team and do not have any input on their priorities. The post I made before has a link to the policies regarding how bugs are treated and the order in which they are solved. Please review it for more information.

We have the same problem with the Java version.
Converting an existing PDF file to PdfFormat.PDF_A_3B or PdfFormat.ZUGFeRD produces a file that is more than 10 times larger than its source file.
A solution would be great.

Best regards
Matthias

@curmas

Would you kindly share your sample file and the sample code with us as well? We will log a dedicated ticket for your case and share the ID with you.

@asad.ali ,
OK, I will do that after my holidays :wink:
Best regards
Matthias

Hi @asad.ali,

I nearly forgot to send the sample files. Here they are:

  • ZUGFeRD-invoice EXTENDED 2p0.xml
    = data file for the ZUGFeRD PDF (30 KB)
  • Rechnungsmuster.pdf
    = example PDF used for ZUGFeRD generation (200 KB)
  • Zugferd Aspose Rechnungsmuster EXTENDED 2p0.pdf
    = result of ZUGFeRD generation by Aspose (3 MB)
  • Zugferd OtherTool Rechnungsmuster EXTENDED 2p0.pdf
    = result of ZUGFeRD generation by another software tool (300 KB)
    (maybe this result file from the other tool helps in fixing the problem)

Sample files.zip (3.2 MB)

Kind regards
Matthias

@curmas

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFJAVA-43543

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

The issues you have found earlier (filed as PDFJAVA-43543) have been fixed in Aspose.PDF for Java 24.11.