Remove blank rows from worksheet

I have gone through a number of discussion around how can we delete all the rows which may have formula but the data is missing. Sometimes users drag the formula to the rows where data is never populated. For example the maxDataRow gives 1000 rows however there is data in the all only for 50 rows.
I tried using DeleteBlankRow but this did not help. Can you help me with any alternate API I can use in java?

@shahsidd1,

Could you please share a sample Excel file from which you need to remove the blank rows, excluding the 50 rows? Also, give details on which specific rows you want to remove in the worksheet. We will evaluate the file and assist you in accomplishing this task. If there is no logical rule for removing specific “blank” rows, you might require to manually evaluate and remove each unwanted row one by one in the code. Once we receive your sample Excel file and details on which rows you need to remove, we will formulate and potentially devise code to do so.

PS. Please zip the Excel file before attaching it here.

Thanks Amjad,

Attached is the file. It contains data till 11 row however there are formulas till 500 rows
truncate.zip (9.9 KB)

We only need to keep 11 rows where the data is visible. All the other rows should be deleted.

@shahsidd1,

Thanks for the template XLSX file.

For your specific requirements, you may check cells one by one in rows to evaluate blank rows/columns and then either remove or set cells value to blank. To check empty (blank) rows, you may iterate every initialized Row in RowCollection and for every Row, you need to iterate cells in Row iteration. Evaluate cell’s type if it is null or blank. See the following sample code to accomplish your task for your reference.
e.g.
Sample code:

Workbook workbook = new Workbook("g:\\test2\\truncate.xlsx");
Worksheet worksheet = workbook.Worksheets[0];

workbook.CalculateFormula();

RowCollection rows = worksheet.Cells.Rows;
System.Collections.IEnumerator iter = rows.GetEnumerator();
  while (iter.MoveNext())
  {
       Row row = (Row)iter.Current;
       //Get the enumerator from Cells collection
       System.Collections.IEnumerator cellEnumerator = row.GetEnumerator();
       //Traverse cells in the collection
       while (cellEnumerator.MoveNext())
       {
           var cell = (Aspose.Cells.Cell)cellEnumerator.Current;                       
           if(cell.Type == CellValueType.IsString)
           {
               if(cell.Type == CellValueType.IsNull || (String)cell.Value =="")
               {
                   cell.PutValue("");
               }
           }
           
       }
   }

// Delete all blank rows and columns which do not contain any data.
worksheet.Cells.DeleteBlankRows();
worksheet.Cells.DeleteBlankColumns();

workbook.Save("g:\\test2\\out1.xlsx");

Hope, this helps a bit.

@shahsidd1
We will look into whether we can process those cells( which values of cells’s formula are empty) as blank later.
We have logged it with issue id CELLSNET-55453

Thanks Amjad and Simon,

Amjad,

The code does work but has performance impact. For a sheet with around 10 lakh records we see iteration taking considerable time.

We also have done some fixes on our code side to avoid deleting empty rows in between 2 data rows else it will impact calculation on the dependent sheets.

Thanks one again for the help

@shahsidd1
We will add option to check whether a formula cell is empty when deleting blank rows, then we do not need to set null to those cells again. It will improve performance.
But if there are too many records, this method still will consume much time for we have to compare each row and shift rows.

@shahsidd1

This is to inform you that your issue (logged as “CELLSNET-55453”) has been resolved. The enhancement/fix will be included in our upcoming release (Aspose.Cells v24.4) that we plan to release in this week. You will be notified when the next version is released.

We add one new class DeleteBlankOptions to support your needs:
e.g.

DeleteBlankOptions dopts = new DeleteBlankOptions();
dopts.EmptyFormulaValueAsBlank = true;
cells.DeleteBlankRows(dopts); 

The issues you have found earlier (filed as CELLSNET-55453) have been fixed in this update. This message was posted using Bugs notification tool by johnson.shi