Considerable regression in size of generated PDF files containing images

I am using Aspose.PDF for .NET 21.10.1 (the latest at the time of writing) and I observe a considerable regression in generated PDF file size compared to Aspose.PDF for .NET 10.1.0 when the PDF contains a JPEG image compressed at a low quality level (JPEG quality level = 25 in my case, chosen to keep the JPEG file small).

Here is the code snippet for Aspose.PDF for .NET 21.10.1 where the issue exists:

using Aspose.Pdf;

namespace ConsoleApp
{
    class Program
    {
        static void Main(string[] args)
        {
            var document = new Document();
            var page = document.Pages.Add();

            var image = new Image(); // Aspose.Pdf.Generator.Image no longer exists
            image.File = @"..\..\Image_JpegQualityLevel25.jpg";
            image.FixHeight = 216;
            image.FixWidth = 216;

            FloatingBox box = new FloatingBox(216, 216);
            box.Top = 72;
            box.Left = 72;
            box.Paragraphs.Add(image);

            page.Paragraphs.Add(box);

            document.Save("broken_size_of_pdf_file_with_image.pdf");
        }
    }
}

Here is the code snippet for Aspose.PDF for .NET 10.1.0 where the issue does not exist:

using Aspose.Pdf;
using ImageFileType = Aspose.Pdf.Generator.ImageFileType;

namespace ConsoleApp
{
    class Program
    {
        static void Main(string[] args)
        {
            var pdf = new Aspose.Pdf.Generator.Pdf();
            var section = pdf.Sections.Add();

            var image = new Aspose.Pdf.Generator.Image();
            image.ImageInfo.ImageFileType = ImageFileType.Jpeg;
            image.ImageInfo.File = @"..\..\Image_JpegQualityLevel25.jpg";
            image.ImageInfo.FixHeight = 216;
            image.ImageInfo.FixWidth = 216;

            var box = new Aspose.Pdf.Generator.FloatingBox(216, 216);
            box.Top = 72;
            box.Left = 72;
            box.Paragraphs.Add(image);

            section.Paragraphs.Add(box);

            pdf.Save("correct_size_of_pdf_file_with_image.pdf");
        }
    }
}

I am attaching Visual Studio solutions for both the broken and the working case, along with the PDFs generated in each case:

Aspose_PDF_file_size_on_Image_save_issue.zip (151.2 KB)

You will see that my JPEG image file is 35529 bytes. Aspose.PDF 10.1.0 generates a PDF file of 36948 bytes (about 1.4 KB of overhead on top of the image), while Aspose.PDF 21.10.1 generates a PDF file of 71796 bytes, i.e. roughly 1.94 times as large as the one generated by version 10. If I have larger JPEG images in my PDF, and/or more images, this approximately 2:1 file size ratio remains the same, which translates into a substantial and noticeable overall file size regression.

Can you please look into this PDF file size regression issue?

This incident appears similar to an earlier one that was reported as resolved at the time: The size of file Aspose.Pdf generates when converting from image

@AlexeyM

We are checking the scenario and will get back to you soon with our feedback.

@AlexeyM

We have logged an investigation ticket as PDFNET-50914 in our issue tracking system. We will look into its details further and keep you posted on the status of the ticket. Please bear with us while we investigate.

We are sorry for the inconvenience.

Thanks Asad! From the outside it looks like the newest Aspose re-encodes the JPEG data, losing the original content stream provided by the user, while the older Aspose was not that smart and embedded the original content stream as is. I speculate this is done to enforce compliance of the JPEG stream with the PDF standard? Whatever the reason, there should perhaps be an option that lets the API user take responsibility for the compliance of the data they provide, and thus makes a smaller PDF file size possible.
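One rough way to test the re-encoding guess from the outside is to check whether the original JPEG bytes appear verbatim inside each generated PDF. The sketch below uses only System.IO and no Aspose API; the class name JpegPassthroughCheck and its IndexOf helper are made up for illustration. It assumes that a library which passes the image through unchanged stores the file bytes as they are in the PDF (a plain DCTDecode stream with no extra filter on top). A miss is therefore only a hint: stripped metadata or an additional stream filter would also hide the bytes without any re-encoding having taken place.

```csharp
using System;
using System.IO;

static class JpegPassthroughCheck
{
    // Returns the offset of the first occurrence of needle in haystack, or -1 if absent.
    // A naive scan is fine here because the files in question are well under 1 MB.
    public static int IndexOf(byte[] haystack, byte[] needle)
    {
        if (needle.Length == 0) return 0;
        for (int i = 0; i <= haystack.Length - needle.Length; i++)
        {
            int j = 0;
            while (j < needle.Length && haystack[i + j] == needle[j]) j++;
            if (j == needle.Length) return i;
        }
        return -1;
    }

    static void Main(string[] args)
    {
        if (args.Length != 2)
        {
            Console.WriteLine("Usage: JpegPassthroughCheck <image.jpg> <document.pdf>");
            return;
        }

        byte[] jpeg = File.ReadAllBytes(args[0]);
        byte[] pdf = File.ReadAllBytes(args[1]);

        int offset = IndexOf(pdf, jpeg);
        if (offset >= 0)
            Console.WriteLine("JPEG bytes found verbatim at offset " + offset + ".");
        else
            Console.WriteLine("JPEG bytes not found verbatim (possibly re-encoded, filtered, or stripped of metadata).");
    }
}
```

Running it once against correct_size_of_pdf_file_with_image.pdf and once against broken_size_of_pdf_file_with_image.pdf, each time with Image_JpegQualityLevel25.jpg as the first argument, would show whether the two versions differ in this respect.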

@AlexeyM

Thanks for sharing your findings and further concerns. We have updated the ticket information accordingly and will investigate from this perspective as well. We will inform you once we have feedback to share. Please bear with us in the meantime.