Visit public Cell this[int row, int column] of class `Cells` will enlarge the sheet

using Aspose.Cells;

namespace TestExcelFormat
{
internal class Program
{
static void Main(string[] args)
{
Workbook wb = new Workbook(@“1.xlsx”);
Console.WriteLine(wb.Worksheets[0].Cells.MaxColumn);
var cell = wb.Worksheets[0].Cells[0, 2];
Console.WriteLine(wb.Worksheets[0].Cells.MaxColumn);
wb.Save(@“2.xlsx”);
}
}
}

unzip 2.xlsx and open sheet1.xml we can see:

row r=“1” spans=“1:3” ht=“13.9”

But this cell has no value, no format.
This will impact downstream toolchains, e.g., using openpyxl to open 2.xlsx will show the sheet contains 3 columns.

@cnbjw

Cause

The issue you’re encountering is related to how Aspose.Cells handles empty cells. When you access a cell using the Cells indexer, it creates a cell object even if it doesn’t contain any value or formatting. This can lead to the appearance of additional columns in the resulting Excel file, which may not be desirable for downstream applications.

Solution

To prevent the creation of empty cells from affecting the column count in your output file, you can check if the cell has any value or formatting before accessing it. If it doesn’t, you can choose to skip it or handle it differently.

Code

Here’s a modified version of your code that checks for cell content before accessing it:

using Aspose.Cells;

namespace TestExcelFormat
{
    internal class Program
    {
        static void Main(string[] args)
        {
            Workbook wb = new Workbook(@"1.xlsx");
            Console.WriteLine(wb.Worksheets[0].Cells.MaxColumn);
            
            // Access the cell at (0, 2)
            var cell = wb.Worksheets[0].Cells[0, 2];
            
            // Check if the cell has a value or formatting
            if (cell.Value != null || cell.GetStyle() != null)
            {
                Console.WriteLine("Cell has value or formatting.");
            }
            else
            {
                Console.WriteLine("Cell is empty, skipping.");
            }

            Console.WriteLine(wb.Worksheets[0].Cells.MaxColumn);
            wb.Save(@"2.xlsx");
        }
    }
}

Next Steps

  1. Run the modified code to confirm that it resolves the issue with the additional columns.
  2. If you still encounter problems, check the original Excel file for any hidden or merged cells that might be affecting the output.
  3. Test the output file with your downstream tools to ensure compatibility.

Please let me know if you need further assistance!

Sample.7z (12.6 KB)

In some situations, visiting an empty(nonexisting) cell is Inevitable. For example GetDependentsInCalculation, I know this cell is beyand maxrow/maxcol, but I must get the Cell object to call GetDependentsInCalculation.

Is there any save options that can shrink the sheet automatically?(When I open and resave 2.xlsx with microsoft excel, it will return to normal size)

@cnbjw
When using Cells[row, column] to access a cell, if the cell does not exist, a new cell will be created. If you only want to check if this cell exists, please use the Cells.CheckCell method.

This ​​still doesn’t solve the problem.​​ ​​The user​​ might ​​enter​​ something ​​into​​ a cell ​​outside the current range​​ and then ​​undo​​ it later. ​​In that case, the saved file will contain entirely empty rows or columns.​​ I think the best solution ​​would be to offer an option​​ to ​​automatically​​ remove ​​these entirely empty rows and columns​​ when saving.

@cnbjw
Due to the fact that users can add and modify cells at any time, dynamically checking and deleting empty rows and columns can be a very time-consuming and performance consuming check. Please manually call the methods to delete empty rows and columns as needed. Please refer to the following methods for deleting rows and columns.

DeleteBlankColumns(DeleteOptions)
DeleteBlankRows()
DeleteBlankRows(DeleteOptions)
DeleteColumn(int)
DeleteColumn(int, bool)
DeleteColumns(int, int, bool)
DeleteColumns(int, int, DeleteOptions)
DeleteRow(int)|Deletes a row.|
DeleteRow(int, bool)
DeleteRows(int, int)
DeleteRows(int, int, bool)
DeleteRows(int, int, DeleteOptions)

image.png (47.6 KB)

In sheet1.xlm of 2.xlsx
I think the spans value is wrong. It contains Cells that is temporaryly created in runtime but has nothing in it. These runtime processes should not be manifested in the stored files.

Thank you! These functions seem helpful. I will try these later.

@cnbjw
Thank you for your feedback. You are welcome. Please take your time to try the suggested methods. Hopefully, your issue will be sorted out. Please let us know your feedback.

@cnbjw

It is designed that Cells[row,column] will instantiate the Cell object in cells model if this Cell does not exist yet. To visit and check existing Cell objects, using enumerator(such as Cells.GetEnumerator) is the most efficient way.

For those apis that may be used on “null” cell, we provide corresponding apis on Cells for most of them. For your case of checking dependents, you may use Cells.GetDependents and GetDependentsInCalculation instead.

Hope that may help a bit for your requirement.