Hello Aspose support team,
I have an issue with Cells#getMaxColumn when the XLSB format is used (Java, Aspose.Cells version 23.1).
XLSB file:
smallreproXlsb.7z (5.9 KB)
Cells#getMaxColumn returns -1 for a sheet with existing data in the range A1:G11.
The issue occurs when a custom LightCellsDataHandler is set.
Steps to reproduce:
InputStream xlsbInputStream = new BufferedInputStream(new FileInputStream("c:\\dev\\aspose\\smallreproXlsb.xlsb"));
LoadOptions loadOptions = new LoadOptions();
loadOptions.setMemorySetting(MemorySetting.MEMORY_PREFERENCE);
loadOptions.setKeepUnparsedData(false);
loadOptions.setParsingFormulaOnOpen(false);
loadOptions.setLightCellsDataHandler(new LightCellsDataHandler() {
    @Override
    public boolean startSheet(Worksheet worksheet) {
        return true;
    }

    @Override
    public boolean startRow(int i) {
        return true;
    }

    @Override
    public boolean processRow(Row row) {
        return true;
    }

    @Override
    public boolean startCell(int i) {
        return true;
    }

    @Override
    public boolean processCell(Cell cell) {
        return true;
    }
});
Workbook workbook = new Workbook(xlsbInputStream, loadOptions);
WorksheetCollection sheets = workbook.getWorksheets();
Worksheet worksheet = sheets.get(0);
Cells cells = worksheet.getCells();
System.out.println("Max Row: " + cells.getMaxRow());
System.out.println("Max Column: " + cells.getMaxColumn());
if (cells.getMaxColumn() < 0) {
    System.out.println("bug, should be 6");
}
Could you please advise?
Alex
@AlexBBB,
Please note that with the LightCells API, cell data is generally processed by your implementation of LightCellsDataHandler and discarded after LightCellsDataHandler.processCell(Cell cell) is called. As a result, no cell objects remain in memory for the workbook/worksheet after loading, which saves a lot of memory. However, because the Cells collection is then empty, MaxDataColumn/MaxColumn return -1. If you want to keep all or some cells in memory, have LightCellsDataHandler.processCell(Cell cell) return true for those cells. If you keep all cells in memory after loading, you will get the correct MaxDataColumn/MaxDataRow (at a higher memory cost). For performance reasons, we think it is easier and better to calculate the MaxRow/MaxColumn values yourself in the handler.
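The suggestion to calculate the maximum row and column in the handler can be sketched roughly as follows. DimensionTracker is a hypothetical helper, not part of Aspose.Cells. It is kept free of Aspose types so it compiles on its own, and it assumes that startRow(int) and startCell(int) receive zero-based row and column indexes. It only counts rows that report at least one cell, so it may not match Cells.getMaxRow() for rows that carry formatting but no cells.

```java
/**
 * Hypothetical helper (not an Aspose.Cells class): remembers the largest
 * row and column index seen while a sheet is streamed through a handler.
 */
final class DimensionTracker {
    private int maxRow = -1;      // -1 means "no cell seen yet"
    private int maxColumn = -1;
    private int currentRow = -1;

    /** Call when a new sheet starts, e.g. from startSheet(Worksheet). */
    void reset() {
        maxRow = -1;
        maxColumn = -1;
        currentRow = -1;
    }

    /** Call from the handler's startRow(int) with the reported row index. */
    void startRow(int rowIndex) {
        currentRow = rowIndex;
    }

    /** Call from the handler's startCell(int) with the reported column index. */
    void startCell(int columnIndex) {
        // A row only counts once it reports at least one cell.
        if (currentRow > maxRow) {
            maxRow = currentRow;
        }
        if (columnIndex > maxColumn) {
            maxColumn = columnIndex;
        }
    }

    int getMaxRow() {
        return maxRow;
    }

    int getMaxColumn() {
        return maxColumn;
    }
}
```

In a real handler, startRow and startCell would forward their int argument to the tracker before returning, and the values would be read from the tracker after the Workbook constructor returns. That way processCell can return false and no cells need to stay in memory.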
Please be aware that the same code base works fine for an XLSX file with identical content: it returns the correct column count. We are facing the issue only with the XLSB format.
@AlexBBB
Generally, if you do not keep those cells in your implementation of LightCellsDataHandler, it is impossible to get the expected data dimension values. Would you please provide us with your executable code and template files to reproduce the difference? We will check and figure out the issue once we get the needed resources.
Thank you for the reply,
Executable code and xlsx/xlsb files are here:
my.7z (13.1 KB)
Thanks,
Alex
@AlexBBB,
Thanks for the sample files and code snippet.
After initial testing with your template XLSX and XLSB files, I was able to reproduce the issue as you described. By the nature of lightweight mode, no cell objects should remain in the Cells collection at the end, so MaxDataRow/MaxRow and MaxDataColumn/MaxColumn should return -1. Perhaps some maximum-column values are already stored/cached in the XLSX file itself.
We need to investigate your issue in detail. We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): CELLSJAVA-45679
You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.
Is there an update on the analysis of this ticket?
@TarasTielkes
We have fixed the issue with Cells.getMaxColumn() for workbooks loaded from an XLSB file with LightCells. The fix will be included in our next official version, 23.12, which is scheduled to be released in the second week of December.
@TarasTielkes
By the way, when processCell in your LightCellsDataHandler always returns true:
    public boolean processCell(Cell cell) {
        return true;
    }
all cell data is kept in the cells model in memory, and the result should be the same as when loading the workbook without LightCells. In fact, in such a situation it would be better to load the workbook without LightCells, because that may give better performance.
The issues you have found earlier (filed as CELLSJAVA-45679) have been fixed in Aspose.Cells for Java 23.12.