By how much can I expect converting to PDF/A to increase PDF size...and how can I minimize this?

Hello
I am having a serious customer issue with code that we’ve just ported from a different vendor to using Aspose.PDF. Namely, I have a small (55k) form which gets filled in automatically by our software, flattened, and then converted to PDF/A using:
pdfDocument.Convert(new MemoryStream(), PdfFormat.PDF_A_1B, ConvertErrorAction.Delete);

The unflattened filled-in PDF is about 60K; the PDF/A version is about 1.3 MB! I’ve added calls to Document.Optimize() and OptimizeResources() with no benefit.

I saw in the forum (e.g., "Creating huge memory issues while converting PDF to PDF/A - #4 by codewarior") that this may be due to the PDF/A-1b requirements…

  1. Is that (still) correct?
  2. Is there any way to minimize the negative effects of PDF/A?
  3. Is the increase I am seeing normal and to be expected?

Thanks in advance!
Larry

@larry.weisberg

Thanks for contacting the support.

The information in the referred thread was shared on the basis of PDF specifications and is still correct.

We are afraid that reducing the file size would work against PDF/A compliance. If you somehow manage to reduce the negative effects (in your case), the resultant PDF would not pass the compliance test.

As per the PDF specifications, the behavior is normal and expected. We will investigate the scenario further for you if you can share the sample files with us.

RFNTemplate.pdf (51.2 KB)
TestRFN.pdf (1.3 MB)

Hi - I’ve attached 2 files. One is used as a form where we fill in fields, and the other is the final PDF after applying PDF/A. Without posting all my source code :-), in overall terms, after I’ve filled in the fields I do the following… (Because of the way the high-level code is organized, my function to convert to PDF/A might be called more than once on a given document, since this is the method used to get the PDF in byte[] form… but I can’t imagine that is the issue here.)

public static byte[] ConvertToPDFA(byte[] flatpdf)
{
    using (var streamIn = new MemoryStream(flatpdf))
    using (var pdfDocument = new Document(streamIn))
    {
        // This seems to have no effect; same if I call OptimizeResources()
        OptimizeSize(pdfDocument);

        // Convert to a PDF/A-compliant document.
        // Validation is also performed during the conversion process.
        var ops = new PdfFormatConversionOptions(PdfFormat.PDF_A_1B, ConvertErrorAction.Delete);
        ops.FontEmbeddingOptions.UseDefaultSubstitution = false;

        if (!pdfDocument.Convert(ops))
            throw new ApplicationException("Converting PDF to PDF/A (bytes) failed");

        // Save to memory and return the bytes; the using blocks dispose everything.
        using (var streamOut = new MemoryStream())
        {
            pdfDocument.Save(streamOut);
            return streamOut.ToArray();
        }
    }
}

If I call OptimizeResources() AFTER the conversion to PDF/A, then the file is small. However, when I open it in Adobe, I do not see the blue ribbon across the top indicating that it is PDF/A compliant.
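The "optimize after converting" experiment can be checked in code instead of by looking for the ribbon in Adobe. The following is only a minimal sketch, not a confirmed fix: the class and method names (PdfASizeCheck, ConvertSubsetAndValidate) are hypothetical, and it assumes that font subsetting via OptimizationOptions.SubsetFonts is the setting worth trying and that Document.Validate reports the same compliance result a viewer would. Whether the output actually stays compliant has to be verified on the real files.

```csharp
using System;
using System.IO;
using Aspose.Pdf;
using Aspose.Pdf.Optimization;

public static class PdfASizeCheck
{
    // Hypothetical helper (not from the thread): converts to PDF/A-1b,
    // subsets the embedded fonts, then re-validates the saved bytes.
    public static byte[] ConvertSubsetAndValidate(byte[] flatPdf, out bool stillCompliant)
    {
        using (var streamIn = new MemoryStream(flatPdf))
        using (var pdfDocument = new Document(streamIn))
        {
            var ops = new PdfFormatConversionOptions(PdfFormat.PDF_A_1B, ConvertErrorAction.Delete);
            if (!pdfDocument.Convert(ops))
                throw new ApplicationException("Converting PDF to PDF/A failed");

            // Assumption: subsetting keeps the fonts embedded (as PDF/A requires)
            // while dropping glyphs the document never uses.
            var optimization = new OptimizationOptions
            {
                SubsetFonts = true,
                RemoveUnusedObjects = true,
                RemoveUnusedStreams = true
            };
            pdfDocument.OptimizeResources(optimization);

            using (var streamOut = new MemoryStream())
            {
                pdfDocument.Save(streamOut);
                byte[] result = streamOut.ToArray();

                // Re-open the saved bytes and validate them against PDF/A-1b;
                // the validation log is written to a throwaway stream here.
                using (var checkStream = new MemoryStream(result))
                using (var checkDocument = new Document(checkStream))
                using (var log = new MemoryStream())
                {
                    stillCompliant = checkDocument.Validate(log, PdfFormat.PDF_A_1B);
                }
                return result;
            }
        }
    }
}
```

Comparing result.Length with the input length, together with stillCompliant, shows for each document whether the smaller file is still reported as PDF/A-1b.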

Using the previous vendor, I was able to do similar steps in my code and have Adobe show it was PDF/A compliant, yet the PDF was far less than 100 KB, nowhere near the > 1 MB I see with Aspose. I have no need for any embedded fonts, but I don’t seem to be able to create a final PDF that isn’t bloated, apparently with embedded fonts.

Can anyone suggest a solution/workaround?

Thanks!
Larry

@larry.weisberg

Thanks for sharing sample files and details.

We have logged an issue as PDFNET-47869 in our issue tracking system for the sake of further investigation. We will look into details of the issue and keep you posted with the status of its correction. Please be patient and spare us some time.

We are sorry for the inconvenience.

Thanks! Looking forward to your response.

Hi. Has any progress been made on this issue?

Thanks,
Larry

@larry.weisberg

We are afraid that the earlier logged ticket is still pending analysis due to its low priority. Please note that it was logged under the normal support model and will be resolved on a first-come, first-served basis. We will surely inform you as soon as we have some definite updates regarding its resolution. Please spare us some time.

We are sorry for the inconvenience.