PDF to PDF/A conversion: optimization problem

Hello!

I convert PDF documents to PDF/A this way:

  // Load the source PDF from memory.
  Document document = new Document(new ByteArrayInputStream(this.sourceContent));
  boolean optimize = true;

  // Convert to PDF/A-1a; objects that cannot be converted are deleted.
  PdfFormatConversionOptions options = new PdfFormatConversionOptions(PdfFormat.PDF_A_1A, ConvertErrorAction.Delete);
  options.setOptimizeFileSize(optimize);
  document.convert(options);

  // Additional resource optimization after the conversion.
  if (optimize) {
    OptimizationOptions opt = new OptimizationOptions();
    opt.setRemoveUnusedObjects(true);
    opt.setRemoveUnusedStreams(true);
    opt.setLinkDuplcateStreams(true);
    opt.setSubsetFonts(true);
    document.optimizeResources(opt);
  }

  document.save("output.pdf");

Without optimization, the resulting PDF (output.pdf) becomes far too big (1825 KB).
The Adobe Acrobat memory check shows that the overhead comes from embedded fonts.

Tests showed that calling setOptimizeFileSize(true) during conversion to PDF/A is not enough to reduce the file size, contrary to the detailed explanation in "Aspose Pdf uses two ways to optimize PDF file size" (post #12 by asad.ali). Why is that?

With the additional optimization (document.optimizeResources) enabled as shown above, the size of the resulting PDF (outputOptimized.pdf) is fine (93 KB), but some document content is lost (in this example, the page number at the end of page 2 is missing).

Is there a way to convert and optimize this document without losing content?
Attached is a zip with input.pdf (405 KB) and the two output PDFs: optimize.zip (2.2 MB)
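
Until the cause is known, one possible safeguard is to compare the text of each page before and after optimization and fall back to the unoptimized document when something has disappeared. The following is only a minimal sketch of such a comparison in plain Java. The class ContentLossCheck and its methods are hypothetical helpers, not part of Aspose.PDF, and the sketch assumes the per-page text has already been extracted into strings by whatever text extraction facility is available. It counts tokens rather than collecting them in a set, so a lost page number "2" is still detected when "2" also appears elsewhere on the page.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not part of Aspose.PDF): detects text lost between two versions of a document. */
class ContentLossCheck {

    /** Counts how often each whitespace-separated token occurs in the text of one page. */
    static Map<String, Integer> tokenCounts(String pageText) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : pageText.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Returns each token that occurs less often in 'after' than in 'before', with the number of lost occurrences. */
    static Map<String, Integer> lostTokens(String before, String after) {
        Map<String, Integer> afterCounts = tokenCounts(after);
        Map<String, Integer> lost = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : tokenCounts(before).entrySet()) {
            int deficit = entry.getValue() - afterCounts.getOrDefault(entry.getKey(), 0);
            if (deficit > 0) {
                lost.put(entry.getKey(), deficit);
            }
        }
        return lost;
    }

    /** Returns the 1-based numbers of all pages on which at least one token was lost. */
    static List<Integer> pagesWithLoss(List<String> before, List<String> after) {
        List<Integer> pages = new ArrayList<>();
        for (int i = 0; i < before.size(); i++) {
            // A page that is missing entirely from 'after' is treated as empty text.
            String afterText = i < after.size() ? after.get(i) : "";
            if (!lostTokens(before.get(i), afterText).isEmpty()) {
                pages.add(i + 1);
            }
        }
        return pages;
    }
}
```

If pagesWithLoss returns a non-empty list for the optimized document, the unoptimized result could be saved instead. A token-level comparison only covers extractable text, so it would not notice lost images or vector graphics.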

@dvtdaten

We were able to reproduce the same issue on our side and need to investigate the reasons behind it. We have logged an issue as PDFJAVA-39879 in our issue management system in order to determine why the PDF file size is not reduced during conversion. We will also investigate the content loss and let you know as soon as the logged ticket is resolved. Please be patient and give us some time.

We are sorry for the inconvenience.

Hello!
Do you have any news on this case? A test with Aspose.PDF 21.3 showed that optimizeResources still loses content, which forces us to skip optimization and accept a large output PDF.

@dvtdaten

The issue already has the highest priority. However, due to its complexity, the investigation is taking time. You will be notified as soon as we make some progress towards its resolution. We greatly appreciate your patience in this regard.

We apologize for the inconvenience being faced.