Gigantic file size when saving Aspose.Cells.Workbook as PDF

Dear support,

When converting Excel files (*.xlsx) into PDF files using the Aspose.Cells component, in some cases the file size (and processing time) grows enormously.

I’ll attach a sample project containing 3 xlsx files with anonymized data. (I had to replace the content with random strings, due to sensitive data, but the issue doesn’t change with the altered data.)
All of these files have been converted to PDF using 3 different methods:

  1. Using Excel (Microsoft® Excel® for Microsoft 365 MSO (Version 2502 Build 16.0.18526.20168) 64-bit) and “Print to PDF”
  2. Using Excel and “Save as ‘*.pdf’”
  3. Using the provided console application (C#, .NET Framework v4.8, Aspose.Cells 2025.03)

The corresponding file sizes are as follows:

| Filename | xlsx (KB) | Excel Print to PDF (KB) | multiple of xlsx | Excel Save as PDF (KB) | multiple of xlsx | Aspose.Cells (KB) | multiple of xlsx |
|---|---|---|---|---|---|---|---|
| Anonymized_Data1 | 118 | 316 | 2.68 | 374 | 3.17 | 340 | 2.88 |
| Anonymized_Data2 | 2080 | 6233 | 3.00 | 9801 | 4.71 | 66249 | 31.85 |
| Anonymized_Data3 | 19 | 372 | 19.58 | 71 | 3.74 | 30 | 1.58 |

Predictably, PDFs aren’t quite as efficient at storing data as an Excel sheet is, so I was expecting some increase in file size either way, but growing from 2 MB to 66 MB for the second file seems excessive. That is an increase of more than 30-fold.
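For reference, the "multiple of xlsx" figures are simply the PDF size divided by the xlsx size. A minimal sketch of that arithmetic, using the Aspose.Cells sizes from the table above (the class and method names are made up for illustration):

```csharp
using System;

static class SizeRatios
{
    // Size of a PDF relative to its source xlsx, rounded to two decimals.
    public static double Multiple(double pdfKb, double xlsxKb)
    {
        return Math.Round(pdfKb / xlsxKb, 2);
    }

    static void Main()
    {
        // Sizes in KB: Aspose.Cells PDF vs. source xlsx.
        Console.WriteLine(Multiple(340, 118));    // Anonymized_Data1
        Console.WriteLine(Multiple(66249, 2080)); // Anonymized_Data2
        Console.WriteLine(Multiple(30, 19));      // Anonymized_Data3
    }
}
```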

In the original Excel sheet, which, again, I am not allowed to hand out, the difference was even more obvious: an increase from 1.2 MB (xlsx) to 67 MB (PDF), which is roughly 55-fold.

Could you please investigate this enormous increase in file size and either provide any comprehensible reasoning for this drastic increase in file size or even better implement a fix to keep PDF file sizes to a necessary minimum?

Not all conversions have this issue, though. As the table above shows, for file #3 you even beat the file sizes generated by both Excel functions.

excel_to_pdf_file_size.zip (3.3 MB)

Unfortunately, I had to skip the PDF documents of the second file, because of upload file size restrictions. I’m sure though, you’ll be able to recreate the massive file size when executing the given console application!

Btw.: Your upload dialog, when posting, displays a limit of 48.8MB, but it wasn’t even accepting 14.8MB.

@M.Heinz

When converting Excel files to PDF using Aspose.Cells, it’s not uncommon to encounter significant increases in file size, as you’ve observed. This can be attributed to several factors, including the complexity of the Excel file, the content types, and the settings used during the conversion process.

To help mitigate the file size issue, you can use the PdfSaveOptions class to optimize the PDF output. Specifically, you can set the optimization type to minimum size (PdfOptimizationType.MinimumSize in the .NET API, MINIMUM_SIZE in the Java API), which is designed to reduce the size of the generated PDF. Here’s a sample code snippet demonstrating how to implement this:

using Aspose.Cells;
using Aspose.Cells.Rendering;

class Program
{
    static void Main()
    {
        // Load the Excel file into a Workbook object
        Workbook workbook = new Workbook("path/to/your/file.xlsx");

        // Create PdfSaveOptions and set the optimization type
        PdfSaveOptions options = new PdfSaveOptions();
        // The .NET enum member is MinimumSize (MINIMUM_SIZE is the Java spelling)
        options.OptimizationType = PdfOptimizationType.MinimumSize;

        // Save the workbook as a PDF with the specified options
        workbook.Save("path/to/output/file.pdf", options);
    }
}

By applying this optimization, you may see a reduction in the file size of the generated PDF. However, keep in mind that the effectiveness of this approach can vary based on the specific content and structure of your Excel files.
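To see how much the setting actually changes for a given file, the two optimization types can be compared side by side. The following is only a sketch: it assumes the .NET enum members PdfOptimizationType.Standard and PdfOptimizationType.MinimumSize, and the input path is a placeholder to be replaced with your own file.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using Aspose.Cells;
using Aspose.Cells.Rendering;

class CompareOptimization
{
    // Renders the workbook to an in-memory PDF and reports size and elapsed time.
    static void Measure(string xlsxPath, PdfOptimizationType type)
    {
        var workbook = new Workbook(xlsxPath);
        var options = new PdfSaveOptions { OptimizationType = type };

        var timer = Stopwatch.StartNew();
        using (var pdf = new MemoryStream())
        {
            workbook.Save(pdf, options);
            timer.Stop();
            Console.WriteLine($"{type}: {pdf.Length / 1024} KB in {timer.Elapsed.TotalSeconds:F1} s");
        }
    }

    static void Main()
    {
        // Placeholder path (assumption); point this at the file under test.
        const string path = "path/to/your/file.xlsx";
        Measure(path, PdfOptimizationType.Standard);
        Measure(path, PdfOptimizationType.MinimumSize);
    }
}
```

If both runs report nearly the same size, the bulk of the PDF is probably not coming from anything this option controls.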

If you continue to experience excessive file sizes even after applying these optimizations, it may be beneficial to review the content of your Excel files. Certain elements, such as high-resolution images or complex formatting, can contribute significantly to the size of the resulting PDF.
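One way to review the content is to print a few per-sheet figures before converting. This is a rough diagnostic sketch, not an official procedure: it lists the data range, the range that also includes cells carrying only formatting, and the number of pictures, shapes, and charts. The file path is a placeholder.

```csharp
using System;
using Aspose.Cells;

class InspectWorkbook
{
    static void Main()
    {
        // Placeholder path (assumption); replace with the workbook to inspect.
        var workbook = new Workbook("path/to/your/file.xlsx");

        foreach (Worksheet sheet in workbook.Worksheets)
        {
            Cells cells = sheet.Cells;

            // MaxDataRow/MaxDataColumn cover cells that contain data;
            // MaxRow/MaxColumn also cover cells that only carry a style.
            // All four return -1 for an empty sheet, hence the +1.
            Console.WriteLine($"{sheet.Name}:");
            Console.WriteLine($"  data range:      {cells.MaxDataRow + 1} rows x {cells.MaxDataColumn + 1} columns");
            Console.WriteLine($"  formatted range: {cells.MaxRow + 1} rows x {cells.MaxColumn + 1} columns");
            Console.WriteLine($"  pictures: {sheet.Pictures.Count}, shapes: {sheet.Shapes.Count}, charts: {sheet.Charts.Count}");
        }
    }
}
```

A formatted range that is much larger than the data range, or a high picture count, would be worth mentioning when reporting the issue.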

For further assistance, consider sharing specific details about the content of your Excel files or any additional settings you are using during the conversion process. This information can help in diagnosing the issue more effectively.

Sources:
[1]: SaveExcelToPDF.java

Unfortunately, the option PdfOptimizationType.MINIMUM_SIZE doesn’t change anything in the grand scheme of things.

Shaving approx. 600 KB off a 66 MB file is not the solution we’re looking for.

@M.Heinz,

Thanks for the template Excel files, sample app and details.

After initial testing, I am able to reproduce the issue as you mentioned by using your template Excel file (“Anonymized_Data2.xlsx”) to convert to PDF. I encountered an excessively large file size when saving Aspose.Cells.Workbook as a PDF. Moreover, Aspose.Cells takes more time to convert the file to PDF.

We require a thorough evaluation of the issue. We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): CELLSNET-58078

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

Apologies for the inconvenience caused. You can upload files up to 10 MB in size. We will fix the limit shown in the upload dialog as well.

I can confirm, the processing time also skyrockets on our end.

We’re also using a secondary processing pipeline for these documents, which I haven’t mentioned yet because I was still trying to build another PoC console app for it. With that pipeline, we’ve managed to generate a 214 MB PDF file from the same Original_Data2.xlsx document with almost no editing, and it took even longer. If I manage to reproduce this behavior in another PoC, I’ll make sure to let you know.

@M.Heinz,

Thanks for your confirmation on it. We will also address it.

Sure, please take your time to create a console demo app with resource files as a PoC and post it here. We will look into it as well.

Fyi:

After careful consideration, I won’t bother you with investigating the other processing pipeline. It is not remotely practical to convert the previously mentioned worksheet (roughly 57 x 3300 cells) into a Word document, either by recreating the table in Word or by using Aspose.Cells.Workbook.Save(*, SaveFormat.Docx), and then convert that Word document to a PDF file.

Just for reference, converting the Excel file to Word doesn’t take long (probably a few seconds) and increases the file size from ~2 MB to ~3.6 MB, whereas saving this intermediate Word document as a PDF takes roughly 15 minutes and increases the file size to 228 MB, which is more than 100-fold the original Excel file size.
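A minimal sketch of how such a two-step pipeline could be timed, assuming both Aspose.Cells and Aspose.Words are referenced. This is not our production code; the file names are placeholders and the Report helper is made up for illustration.

```csharp
using System;
using System.Diagnostics;
using System.IO;

class PipelineTiming
{
    // Prints the size of the produced file and the time the step took.
    static void Report(string step, string file, Stopwatch timer)
    {
        long kb = new FileInfo(file).Length / 1024;
        Console.WriteLine($"{step}: {kb} KB after {timer.Elapsed.TotalSeconds:F1} s");
    }

    static void Main()
    {
        // Placeholder file names (assumptions).
        const string xlsx = "input.xlsx";
        const string docx = "intermediate.docx";
        const string pdf = "output.pdf";

        // Step 1: Excel -> Word via Aspose.Cells.
        var timer = Stopwatch.StartNew();
        new Aspose.Cells.Workbook(xlsx).Save(docx, Aspose.Cells.SaveFormat.Docx);
        timer.Stop();
        Report("xlsx -> docx", docx, timer);

        // Step 2: Word -> PDF via Aspose.Words (format inferred from the extension).
        timer.Restart();
        new Aspose.Words.Document(docx).Save(pdf);
        timer.Stop();
        Report("docx -> pdf", pdf, timer);
    }
}
```

The types are fully qualified because both libraries define a SaveFormat enum.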

@M.Heinz,

Do you use Aspose.Words for converting Word documents to PDF? If yes, you may post/share your issue along with resource files in the Aspose.Words forum: Aspose.Words Forum.

Yes, we do.

But I don’t think the use case is of practical relevance, as it is infeasible to display 57 columns of a Word table within a single page’s width without squashing the columns so hard that you’re unable to read anything.

That’s why we will refrain from filing this issue - unless you specifically want to analyze the performance on a completely “broken” Word document.

@M.Heinz,

You’re absolutely right; trying to fit 57 columns on a single page would make the data too small to read clearly. However, the time cost (e.g., 15 minutes) still seems excessive, and the file size (228MB) is quite large. I would suggest posting the issue along with a sample app, including the necessary resource files.

I’ll try to isolate the issue and, if possible, make sure to post it in the Aspose.Words forum.

@M.Heinz,

Thank you for your efforts. If you encounter any other issue(s) or have questions regarding Aspose.Cells APIs, please don’t hesitate to reach out to us.

Just in case you want to follow the respective Word to PDF issue: Gigantic file size when saving Aspose.Words.Document as PDF

@M.Heinz
Thank you for your feedback. We have created the relevant issue CELLSNET-58078 for Cells product. We will notify you promptly once there are any updates.

Thanks @John.He ,
your colleague has already pointed that one out. I’ve been talking to them about an additional Aspose.Words issue though, which at first I didn’t want to file due to the structure/relevance of the test data, but they were interested in the report anyway. Cf. Gigantic file size when saving Aspose.Cells.Workbook as PDF - #8 by amjad.sahi.

@M.Heinz,

Thank you for your additional feedback.

I think it is good that you’ve raised the Word-to-PDF performance and size issue in the Aspose.Words forum. Kindly allow the Aspose.Words team some time to thoroughly evaluate the matter. They will respond to you in the thread with their findings and further details.

No worries there. As stated earlier: For this specific document, the result is impractical to view in Word anyways, so I’m not too concerned about any immediate resolution of the Words issue.

If they’re able to improve the performance of the Aspose.Words component using the specified test case though, I’m not against it by any means :wink: .

@M.Heinz,

Alright, and sure, let the Aspose.Words team get back to you with their analysis and findings. In the meantime, we will continue addressing the issue (i.e., CELLSNET-58078 - Gigantic file size when saving Aspose.Cells.Workbook as PDF). Rest assured, we will update you here as soon as we have any progress to share.