Is LightCellsDataProvider available in CellsforPythonvia.NET?

I have to report another strange phenomenon. Before I used the save level options, my normal spreadsheets were 4 meg. When I open and save them, they are slightly smaller like 3.9 meg. So putting on save level 9, they squeezed by about 1/3 to 3 meg. When I open and save them, the get larger back to the 3.9 meg. It seems excel mod then save actually made small files worse and large files better.

@SteveOlson,

Thanks for sharing the sample Excel files and details. We will evaluate them and get back to you with updates.

@SteveOlson
There are too many empty cells in the file, if you save the file with MS Excel, MS Excel will remove all empty cells in the column and set style to column object If these cells contain same style.
Aspose.Cells does not check whether we can remove those empty cells for performance.

So please reduce empty cells with your codes. If all cells in the column contain same style, please set the style to the column (Column.SetStyle()). Then if the style of the cell is same as style of column and there is no data, please do not call Cells[row,column], then no empty cells will generated.
For example : In Column C: most of cells has same borders, so please apply borders style to the column C, then we only need to create a few Cell objects when need.

Also.

  1. We can call Cells.DeleteBlankColumns() method to remove empty columns. But this method shifts right columns.
  2. If you want to reduce the file size, you can ignore the names of the cells as the following :
    OoxmlSaveOptions saveOptions = new OoxmlSaveOptions();
    saveOptions.ExportCellName = false;
  3. It’s better that you have to reduce empty cells.

Thank you! This is great information. One problem with my test data is that I was trying to stress test and thus had some random styling. I will try some new strategies to reduce the size.

@SteveOlson,

You are welcome. We hope the suggested approaches and alternative solutions/techniques will help you address your concerns and problems effectively. Please don’t hesitate to reach out again if you have any additional questions or feedback.

One final note: I changed my template to have all the “base level” formatting (just a border) on all the data columns. I then removed the border at the top header section. So I no longer needed to include a border style at the cell level. I also made the styles more in line with actual reports. I set only 2 columns with actual color styling. I’m not sure if I have a null value for the cell if I need the following if… but I added it:
if cell_value is not None:
cell.put_value(cell_value)
I also kept compression level at 9. The initial size was reduced but only to 175K, then post save that got reduced to 87K (post open and save). I was hoping that not individually formatting the borders for all the empty cells would squeeze it more. I’ll leave some links for those two files in case you want to look. But I think my only option is to find a way to reopen and save in excel. I’m not sure if your row by row solution (beginning of thread) would squish it naturally (if that gets ported to python).

https://www.dropbox.com/scl/fi/z2iecgwn0nrt2zn4d2vtj/LessFormat_1392.xlsx?rlkey=wghdwv0pjkfxym0iuuiydkhsn&st=am0yprng&dl=0

https://www.dropbox.com/scl/fi/47siur930wjhth4koe2nt/LessFormat_1392_after_save.xlsx?rlkey=35cwvhgj3abco288uao8xor2d&st=6x4eo9lq&dl=0

@SteveOlson,

Thank you for sharing the files and details.

We will check and evaluate the files thoroughly and then provide our feedback along with additional insights.

@SteveOlson

Please change your codes :
if cell_value is not None:
cell = cells.get(cellName) ’ only get cell object when need, otherwise we have create the cell object and keep it, then it will be saved to the file.
cell.put_value(cell_value)
[/quote]

@SteveOlson

Please compare row 173:
row173_before.zip (1.3 KB)
row173_saved.zip (734 Bytes)
As the style of the column “JU” is 17, please do not create “JU173” cell object through Cells.get() method, then we can reduce “<c r="JU173" s="17"/>” node.

If you want to minimize, we can export cell names only when need as the following :

  Workbook workbook = new Workbook(dir + "LessFormat_1392.xlsx");
OoxmlSaveOptions saveOptions = new OoxmlSaveOptions(SaveFormat.Xlsx);
  saveOptions.ExportCellName = false;
  workbook.Save(dir + "dest.xlsx", saveOptions);

First off, thanks again, and your support has been excellent!! I would recommend you and this product to anyone!!. Using export_cell_name helped along with save options, I should isolate each change to understand individual impact. And your comment above with the null values probably had the largest impact. cell_value is my local python variable that would be used to fill (create) the cell always, but with the “if check” it does not. With the huge amount of blanks, I believe that contributed to great reduction in size.

I also played around with saving as xlsb… and that really seems to squeeze it the most, but folks may not want that for other reasons) but I could give a choice :grinning:

@SteveOlson,

Thanks for the thoughtful feedback and details. We really appreciate it. We are glad to hear that the changes, especially around handling null values and using export_cell_name, made a noticeable impact. It’s good you’re isolating each tweak to measure its effect by yourselves which is a smart approach especially with large datasets. Also, interesting insight on saving as .xlsb, it definitely has compression or other benefits.

Have a good day!

@SteveOlson
Our new version 25.8 now supports inheritance from interfaces. Please download it.

Wow, thank you! I will do that. I might not have time to drill in right away but I will report back any findings once I am able.

@SteveOlson,

You are welcome. Whenever you find the time, please take a moment to evaluate the feature using the new version. We hope it will meet your needs.