Optimize/Compress PDF documents in C# using Aspose.PDF - Optimization size varies with different PDFs

We used the code below to optimize PDFs, but the results are inconsistent.

For example, a 170 KB PDF shrank to 150 KB after optimization, whereas a 100 KB PDF grew to 110 KB, an increase of 10 KB.
We would like to know which components are modified during optimization, so that we can understand why the size decreases in some cases and increases in others.

using (Aspose.Pdf.Document doc = new Aspose.Pdf.Document(fileName))
{
    try
    {
        doc.OptimizeResources();
        doc.Save(fileName);
    }
    catch
    {
        // Retain the original file
    }
}

We have purchased Aspose.PDF and we are happy to share the details if required.
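
One way to guard against the size increase described above is to write the optimized output to a temporary file and replace the original only when the result is actually smaller. The following is a minimal sketch under stated assumptions, not Aspose guidance: `PdfSizeGuard` and `ReplaceIfSmaller` are hypothetical names, and the optimizer is passed in as a delegate so the helper itself does not depend on any particular library call.

```csharp
using System;
using System.IO;

public static class PdfSizeGuard
{
    // Runs optimize(inputPath, tempPath) and overwrites the input only when the
    // optimized file is strictly smaller. Returns true when the input was replaced.
    // Exceptions thrown by the optimizer propagate after the temp file is cleaned up.
    public static bool ReplaceIfSmaller(string inputPath, Action<string, string> optimize)
    {
        string tempPath = inputPath + ".optimized.tmp"; // hypothetical naming scheme
        try
        {
            optimize(inputPath, tempPath);

            long before = new FileInfo(inputPath).Length;
            long after = new FileInfo(tempPath).Length;
            if (after >= before)
            {
                return false; // optimization did not help; keep the original
            }

            File.Copy(tempPath, inputPath, overwrite: true);
            return true;
        }
        finally
        {
            if (File.Exists(tempPath))
            {
                File.Delete(tempPath);
            }
        }
    }
}

// Possible wiring with the calls used in this thread (requires Aspose.PDF):
//
// PdfSizeGuard.ReplaceIfSmaller(fileName, (src, dst) =>
// {
//     using (var doc = new Aspose.Pdf.Document(src))
//     {
//         doc.OptimizeResources();
//         doc.Save(dst);
//     }
// });
```

This does not explain why a given file grows, but it keeps a larger result from overwriting the original.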

@sreeraj_05

Would you please share your sample PDF file with us so that we can test the scenario in our environment and address it accordingly?

After optimization.pdf (213.8 KB)
Before optimization .pdf (160.6 KB)
Please find the attached files.

@sreeraj_05

We tried the following code snippet and observed that the PDF file size was reduced. We tested the scenario using Aspose.PDF for .NET 20.8:

Document doc = new Document(dataDir + @"Before optimization .pdf");
var optimizationOptions = new Aspose.Pdf.Optimization.OptimizationOptions();
optimizationOptions.RemoveUnusedObjects = true;
optimizationOptions.RemoveUnusedStreams = true;
optimizationOptions.AllowReusePageContent = true;
optimizationOptions.LinkDuplcateStreams = true;
optimizationOptions.UnembedFonts = true;
optimizationOptions.ImageCompressionOptions.CompressImages = true;
optimizationOptions.ImageCompressionOptions.ImageQuality = 30;
optimizationOptions.ImageCompressionOptions.ResizeImages = true;
optimizationOptions.ImageCompressionOptions.Version = Aspose.Pdf.Optimization.ImageCompressionVersion.Fast;
optimizationOptions.ImageCompressionOptions.MaxResolution = 72;

doc.OptimizeResources(optimizationOptions);
doc.Save(dataDir + @"ExampleOptimized_20.8.pdf");

ExampleOptimized_20.8.pdf (137.3 KB)

Hi @asad.ali,

Thanks for your suggestions. However, we have a few issues with the code.

  1. We need the fonts to be embedded in the PDF, preferably as embedded subsets, so we replaced the 7th line of your code with the two lines below:

optimizationOptions.UnembedFonts = false;
optimizationOptions.SubsetFonts = true;

However, we are getting several exceptions for different PDFs, such as “Parameter is not valid”, “Object reference not set to an instance of an object”, “An item with the same key has already been added” and “Font subsetting is prohibited because of license restrictions”. I have attached sample PDFs for these exceptions.
Also, in some cases the size still increases after optimization (sample PDF attached).

Font subsetting is prohibited - license restrictions.pdf (133.3 KB)
Null Reference Exception.pdf (164.4 KB)
Parameter is not valid.pdf (1.1 MB)
Size Increases After Optimizing.pdf (180.7 KB)

  2. We would also like to know whether Aspose.PDF has any functionality to consolidate duplicate fonts. We ask because the PDFs generated with Aspose are opened by clients in Adobe Acrobat Standard DC. When they use ‘Save As’ in Acrobat Standard, it shows “Consolidating Duplicate Fonts” and hangs. We would like to generate PDFs in which the fonts are already consolidated, so that embedded or subsetted fonts appear as a single font family in the font properties.

Please let us know how to achieve this.

Thanks!
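
The exceptions in point 1 were reported after enabling `SubsetFonts`, so one possible stopgap is to retry without subsetting when the first attempt throws. The sketch below assumes that approach; `PdfCompressor`, `TryOptimize` and `Compress` are hypothetical names, while the `OptimizationOptions` properties are the ones already used in this thread. It reacts only to exceptions and cannot detect other problems, such as an output file that is larger than the input.

```csharp
using System;
using Aspose.Pdf;
using Aspose.Pdf.Optimization;

public static class PdfCompressor
{
    // Attempts one optimization pass and reports whether it completed without throwing.
    public static bool TryOptimize(string inputPath, string outputPath, bool subsetFonts)
    {
        try
        {
            using (var doc = new Document(inputPath))
            {
                var options = new OptimizationOptions
                {
                    RemoveUnusedObjects = true,
                    RemoveUnusedStreams = true,
                    UnembedFonts = false,      // keep fonts embedded
                    SubsetFonts = subsetFonts  // subset only when requested
                };
                doc.OptimizeResources(options);
                doc.Save(outputPath);
            }
            return true;
        }
        catch (Exception ex)
        {
            // Log the failure so the fallback is visible instead of silently swallowed.
            Console.Error.WriteLine(
                $"Optimization failed (SubsetFonts={subsetFonts}): {ex.Message}");
            return false;
        }
    }

    // Tries with font subsetting first, then falls back to no subsetting.
    public static bool Compress(string inputPath, string outputPath)
    {
        return TryOptimize(inputPath, outputPath, subsetFonts: true)
            || TryOptimize(inputPath, outputPath, subsetFonts: false);
    }
}
```

Writing to a separate `outputPath` rather than over the input keeps the original file available if both attempts fail.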

@sreeraj_05

We were able to reproduce the issues you mentioned in our environment and have logged them as:

  • PDFNET-48728
  • PDFNET-48729
  • PDFNET-48730
  • PDFNET-48731

We will look into them further and keep you informed about the status of their correction. Please be patient and give us some time.

Could you kindly share a screenshot of the message that Adobe Reader shows when it hangs? Furthermore, please share the problematic source PDF, the output PDF, and an expected output document for our reference so that we can assist you further.

Hi @asad.ali,

Below is the screenshot from Adobe Acrobat Standard DC showing “Consolidating duplicate fonts”. This message is shown when we use Save As to save a copy of a large PDF that is greater than 10 MB and contains duplicate fonts, as in the attached screenshot.

Consolidating duplicate Fonts.PNG (1.6 KB)
Duplicate Fonts.png (14.3 KB)

Most PDFs smaller than 25 MB are saved without any issues, and the “Consolidating duplicate fonts” step completes within 5 to 10 seconds. For large PDFs of about 66 MB with more than 100 duplicate fonts, the Save As functionality in Acrobat Standard gets stuck at “Consolidating duplicate fonts” for 1 or 2 hours and then crashes.

Below is a sample PDF that shows 8 fonts, of which 4 are duplicates. We would like to remove the duplicates so that the PDF displays 4 fonts instead of 8.

Test PDF with Duplicate Fonts.pdf (66.3 KB)

Also, please note that this issue does not happen in Adobe Reader DC. It happens only in the Adobe Acrobat Standard/Professional versions, and it does not happen with PDFs smaller than 10 MB.

Please let us know if you need further information.
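
To quantify the duplication before and after any fix, font names can be grouped by their base name. The sketch below is an illustration only: `FontNameReport`, `BaseName` and `FindDuplicates` are hypothetical names, and it relies on the PDF convention that a subset font name starts with a tag of six uppercase letters followed by a plus sign (for example `ABCDEF+Arial`). Obtaining the names through `doc.FontUtilities.GetAllFonts()` and `FontName` is an assumption about the Aspose.PDF API and is shown only in a comment.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class FontNameReport
{
    // Matches a subset tag such as "ABCDEF+" at the start of a font name.
    private static readonly Regex SubsetTag = new Regex(@"^[A-Z]{6}\+");

    // Removes the subset tag, if present, and returns the base font name.
    public static string BaseName(string fontName)
    {
        return SubsetTag.Replace(fontName ?? string.Empty, string.Empty);
    }

    // Maps each base font name that occurs more than once to its number of occurrences.
    public static Dictionary<string, int> FindDuplicates(IEnumerable<string> fontNames)
    {
        return fontNames
            .GroupBy(name => BaseName(name), StringComparer.Ordinal)
            .Where(group => group.Count() > 1)
            .ToDictionary(group => group.Key, group => group.Count(), StringComparer.Ordinal);
    }
}

// Possible wiring (assumed Aspose.PDF API, not confirmed in this thread):
//
// using (var doc = new Aspose.Pdf.Document(fileName))
// {
//     var names = doc.FontUtilities.GetAllFonts().Select(font => font.FontName);
//     foreach (var entry in FontNameReport.FindDuplicates(names))
//     {
//         Console.WriteLine($"{entry.Key}: {entry.Value} entries");
//     }
// }
```

This only reports duplicates; it does not consolidate them.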

@sreeraj_05

Thanks for sharing requested details.

We have logged an investigation ticket as PDFNET-48742 in our issue management system after testing the scenario in our environment. We will further look into its details and inform you as soon as it is resolved. Please be patient and spare us some time.

We are sorry for the inconvenience.

Hi @asad.ali/team,

Can we get an estimate of when we will get a solution or workaround for these issues?

Please let us know whether Aspose provides a way to consolidate fonts in an existing PDF, or to consolidate fonts while merging multiple PDFs. We tried setting IsSubset to false, and it does not work. We also tried unembedding, saving, and re-embedding the fonts, and that does not work either. Sometimes the text becomes scrambled and unreadable when we do this.

Thanks!

@sreeraj_05

We are afraid that we cannot share a reliable ETA at the moment, as the tickets are still pending review. They will be investigated and resolved on a first-come, first-served basis. We will inform you as soon as they are resolved. Please give us some time.

The following options are used during optimization to unembed fonts or remove duplicate fonts from a PDF:

optimizationOptions.UnembedFonts = false;
optimizationOptions.SubsetFonts = true;

Regrettably, they do not seem to work in your case, and we need to investigate the actual reasons behind this. The ticket PDFNET-48742 has been logged specifically for this issue. We will inform you as soon as we have additional updates.

We apologize for the inconvenience.

@asad.ali

We are facing a critical issue while optimizing PDFs with the OptimizationOptions below. Some pages become unreadable after optimization.

    public void CompressPDF(string fileName)
    {
        using (Aspose.Pdf.Document doc = new Aspose.Pdf.Document(fileName))
        {
            try
            {
                OptimizationOptions optimizeOptions = new OptimizationOptions
                {
                    RemoveUnusedObjects = true,
                    RemoveUnusedStreams = true,
                    UnembedFonts = false,
                    SubsetFonts = true
                };
                doc.OptimizeResources(optimizeOptions);
                doc.Save(fileName);
            }
            catch
            {
                //Retain original file
            }
        }
    }

Input File: test readable input.pdf (94.0 KB)
Output File: test unreadable output.pdf (130.4 KB)

Please note that not only does the file size increase from 94 KB to 130 KB, but pages 2 and 3 also become unreadable after optimization. This happens when we use “SubsetFonts = true”.

Please let us know the solution for this issue. If you could tell us under what conditions these fonts become unreadable, we could set “SubsetFonts = false” for those conditions. Please give this fix high priority.

Thanks.

@sreeraj_05

We have logged an issue as PDFNET-48897 in our issue tracking system after reproducing it with Aspose.PDF for .NET 20.10. We will look into its details and keep you posted on the status of the fix. However, please note that it will be investigated and fixed on a first-come, first-served basis as per our free support policies. We appreciate your patience and understanding in this matter. Please give us some time.

We are sorry for the inconvenience.

@asad.ali

Is there any way you could expedite this and provide a solution? We have had to push these issues to subsequent sprints several times. It would be helpful if you could provide an ETA. Please also let us know how we could move this to paid support.

Thanks.

@sreeraj_05

We are afraid that we cannot provide an ETA at the moment, as the investigation of the tickets is not yet complete. Furthermore, issues under the free support model are resolved on a first-come, first-served basis. The resolution time depends on the complexity of the issue and the number of issues logged before it. We apologize for the inconvenience. We will inform you as soon as we have definite updates regarding the resolution of the tickets.

If you are subscribed to paid support, you can log in there using the same email address that was used to purchase the subscription. Create a post there referencing the ticket that you want resolved urgently, and it will be expedited accordingly.

Hi @asad.ali,

Could you please let us know the status of the investigation tickets?
Our goal is to accomplish the two tasks below, which currently do not work with Aspose.PDF in some cases:

  1. Compress PDFs
  2. Consolidate fonts in PDF (while merging PDFs / in a single PDF)

Also, please let us know the complete procedure for obtaining paid support.

@sreeraj_05

We regret to inform you that the investigation of the earlier logged tickets is not yet complete because of other issues logged before them. However, we will let you know the ETA as soon as we complete the analysis. We appreciate your patience and understanding in this regard. Please give us some time.

Please create an inquiry/post in our Purchase Forum in order to get related assistance.

Hi @asad.ali

Could you please let us know the status of the investigation tickets? It would be great if you could share an ETA or the updated package with the fix.

Thanks

@sreeraj_05

We regret to share that the earlier logged tickets have not yet been fully reviewed, and we are not in a position to share a reliable ETA at the moment. We will let you know as soon as we have definite updates in this regard. We appreciate your patience and cooperation in this matter. Please give us some time.

We are sorry for the inconvenience.

Hi @asad.ali
As this issue has been pending on your side for a long time, do you have a fix available for it?
If not, please share the contact details for the next level of escalation so that we can get further assistance.

@sreeraj_05

We are afraid that the earlier logged tickets are not yet resolved because of other pending issues that were logged before them. However, please check our priority support option, where issues are dealt with on an urgent basis. We will inform you as soon as we have definite updates regarding the resolution of the tickets.

We really apologize for the inconvenience faced.