Converting into PDF/A results in large file size

We are facing an issue with the Aspose PDF .NET library (version 19.11.0.0) that may force us to switch to another component.

The issue is related to converting a PDF into PDF/A. The conversion results in a much bigger file, which prevents us from delivering our documents by mail.

We don’t have this issue when using the same functionality with a component from a competitor.

Thanks in advance for your support.

@bart.jacxsens

Would you please try the latest version of the API, i.e. Aspose.PDF for .NET 20.6? If you still experience this issue, please share your sample PDF document with us. We will test the scenario in our environment and address it accordingly.

Tested with v20.6 and the issue persists. Attached is a sample PDF that grows from 3 KB to 43 KB after conversion. sample.pdf (3.0 KB)

@Barbara.Kuzmin

Would you please share which PDF/A format you want to convert your PDF document to (e.g. PDF/A_1a, PDF/A_2a, etc.)?

@asad.ali
We are converting to PDF_A_1B.

@Barbara.Kuzmin

We used the following code snippet with Aspose.PDF for .NET 20.7 and noticed that the output file size was 29 KB.

Aspose.Pdf.Document doc = new Aspose.Pdf.Document(dataDir + "sample.pdf");
PdfFormatConversionOptions options = new PdfFormatConversionOptions(PdfFormat.PDF_A_1B);
options.ConvertSoftMaskAction = ConvertSoftMaskAction.ConvertToStencilMask;
options.ExcludeFontsStrategy = PdfFormatConversionOptions.RemoveFontsStrategy.SubsetFonts | PdfFormatConversionOptions.RemoveFontsStrategy.RemoveDuplicatedFonts;
options.LogStream = new MemoryStream();
options.ErrorAction = ConvertErrorAction.Delete;
options.OptimizeFileSize = true;
doc.Convert(options);
doc.OptimizeResources(
    new Aspose.Pdf.Optimization.OptimizationOptions()
    {
        SubsetFonts = true,
        RemoveUnusedStreams = true,
        RemoveUnusedObjects = true,
    });
// Save output document
doc.Save(dataDir + "sample.output.pdf");

sample.output.pdf (28.2 KB)

Would you please let us know whether this output size is acceptable for you? We will then assist you further accordingly.

Hi,

I have the same problem. Some documents are much bigger after conversion to PDF/A-2b (I have one that is 19 KB and becomes 117 MB after conversion).

I tried the posted code, but I ran into two problems:

  1. When I set SubsetFonts = true in OptimizationOptions, the PDF is no longer PDF/A after OptimizeResources: the PdfFormat property reports PDF 1.7. Without SubsetFonts it works.
  2. For some PDFs I get an exception while saving: “cannot access a closed stream”
    2017-Scrum-Guide-German.pdf (907.9 KB)

We are using Aspose.PDF 21.3

//var convertDoc = new Document(...);
var options = new PdfFormatConversionOptions(PdfFormat.PDF_A_2B);
options.ConvertSoftMaskAction = ConvertSoftMaskAction.ConvertToStencilMask;
options.ExcludeFontsStrategy = PdfFormatConversionOptions.RemoveFontsStrategy.SubsetFonts | PdfFormatConversionOptions.RemoveFontsStrategy.RemoveDuplicatedFonts;
options.ErrorAction = ConvertErrorAction.None;
options.OptimizeFileSize = true;

byte[] pdfDoc;
convertDoc.Flatten();

// Validate against PDF/A-2b, writing the log to a temporary stream
using (var memoryStream = new MemoryStream())
{
    options.LogStream = memoryStream;
    convertDoc.Validate(options);
}

// Convert to PDF/A-2b, again logging to a temporary stream
using (var memoryStream = new MemoryStream())
{
    options.LogStream = memoryStream;
    convertDoc.Convert(options);
}

// Optimize and save the result into a byte array
using (var strmOut = new MemoryStream())
{
    convertDoc.OptimizeResources(
        new Aspose.Pdf.Optimization.OptimizationOptions
        {
            //SubsetFonts = true,
            RemoveUnusedStreams = true,
            RemoveUnusedObjects = true,
        });

    convertDoc.Save(strmOut);

    pdfDoc = strmOut.ToArray();
}

@haenkema

We tested the scenario in our environment using version 21.4 of the API and noticed that it threw the same exception when the OptimizeFileSize option was set. We have therefore logged an issue as PDFNET-49868 in our issue tracking system. We will look further into its details and keep you posted on the status of its correction. Please be patient and spare us some time.

We are sorry for the inconvenience.
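
While PDFNET-49868 is open, the observation above (the exception appears when OptimizeFileSize is set) suggests one interim variation worth trying. The sketch below is an assumption, not a confirmed fix from the Aspose team: it leaves OptimizeFileSize unset and, as an extra precaution that has not been confirmed as a cause, keeps the log stream open until after the document is saved. The file paths and the class name are hypothetical, and the output may still be larger than the input.

```csharp
using System;
using System.IO;
using Aspose.Pdf;

class PdfAConversionSketch
{
    static void Main(string[] args)
    {
        // Hypothetical paths, for illustration only.
        string inputPath = args.Length > 0 ? args[0] : "input.pdf";
        string outputPath = args.Length > 1 ? args[1] : "output.pdf";

        // The log stream is declared first, so it is disposed last and
        // stays open for Convert and Save.
        using (var logStream = new MemoryStream())
        using (var doc = new Document(inputPath))
        {
            var options = new PdfFormatConversionOptions(PdfFormat.PDF_A_2B);
            options.LogStream = logStream;
            options.ErrorAction = ConvertErrorAction.Delete;
            // Assumption: OptimizeFileSize is deliberately left unset,
            // because the exception was reported when it was enabled.

            doc.Convert(options);
            doc.Save(outputPath);

            // Report the format the document claims after conversion.
            Console.WriteLine("Format after conversion: " + doc.PdfFormat);
        }

        // Compare sizes to see how much the conversion grew the file.
        long before = new FileInfo(inputPath).Length;
        long after = new FileInfo(outputPath).Length;
        Console.WriteLine("Size: " + before + " bytes -> " + after + " bytes");
    }
}
```

If the size is still unacceptable, OptimizeResources can be added between Convert and Save, keeping in mind the report earlier in this thread that SubsetFonts = true caused the document to lose its PDF/A format.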

Hi,

What is the status of the issue? Is it fixed?

@Versus2020

We are afraid that the earlier logged ticket has not been resolved yet due to other pending issues in the queue. We will let you know as soon as we have definite updates regarding its resolution. Please be patient and spare us some time.

We are sorry for the inconvenience.

What is the status of the issue? Is it fixed?

@Verus2020

The issue is currently under investigation. As soon as the investigation is complete, we will be able to share news about a fix or a resolution ETA. We will let you know once we have updates in this regard. Please spare us some time.

We are sorry for the inconvenience.