Saving workbook after reading with LightCellsDataHandler adds extra cells


#1

Hi,

Using Aspose Cells for Java, version 19.8.6.

We are using LightCellsDataHandler to limit the row range while reading the workbook, as in our use-case, the workbooks can be enormous, and lead to out-of-memory conditions. Initially, things seem to work as expected: Aspose Cells reports that the maximum row of the workbook matches the limit that we enforce in our handler.

However, saving the workbook does not work as expected, and seems to alter the maximum row, which is scary. One would expect that saving the workbook does not mutate any in-memory state.

Sample code:

public class Demo {
    public static void main(String[] args) throws Exception {
        LoadOptions loadOptions = new LoadOptions();
        loadOptions.setLightCellsDataHandler(new MyLightCellsDataHandler());
        Workbook workbook = new Workbook("input.xlsx", loadOptions);
        WorksheetCollection worksheets = workbook.getWorksheets();
        Worksheet worksheet = worksheets.get("Sheet1");
        Cells cells = worksheet.getCells();

        System.out.println("before save, max data row: " + cells.getMaxDataRow() + ", max row: " + cells.getMaxRow());

        HtmlSaveOptions saveOptions = new HtmlSaveOptions(SaveFormat.HTML);
        saveOptions.setExportActiveWorksheetOnly(true);
        workbook.save("output.html", saveOptions);

        System.out.println("after save, max data row: " + cells.getMaxDataRow() + ", max row: " + cells.getMaxRow());
    }

    private static class MyLightCellsDataHandler implements LightCellsDataHandler {
        @Override
        public boolean startSheet(Worksheet worksheet) {
            return worksheet.getName().equals("Sheet1");
        }

        @Override
        public boolean startRow(int rowIndex) {
            return rowIndex < 10;
        }

        @Override
        public boolean processRow(Row row) {
            return true;
        }

        @Override
        public boolean startCell(int columnIndex) {
            return true;
        }

        @Override
        public boolean processCell(Cell cell) {
            return true;
        }
    }
}

When saving as HTML, a number of rows are added unexpectedly.

But even worse, saving the workbook seems to mutate the state, as cell.getMaxRow() suddenly starts to return a different value than before the save operation.

The console output of the demo program is as following:

before save, max data row: 9, max row: 9
after save, max data row: 9, max row: 30

Kind regards,
Taras

demo.zip (9.4 KB)


#2

@TarasTielkes,
We were able to observe the issue but we need to look into it more. We have logged the issue in our database for investigation and for a fix. Once, we will have some news for you, we will update you in this topic.

This issue has been logged as

CELLSJAVA-43011 – Saving workbook after reading with LightCellsDataHandler adds extra cells