Hi, we recently started experiencing a performance issue in a section of our code after upgrading to Aspose.Cells for Java 23.10. We create an HTML input stream (using non-Aspose code) and save it as XLSX. The performance problem is in the “new Workbook()” call, not in the save() we do right afterward. (Which I find surprising… I’d expect a constructor to be fast, and a save() to be slow.) Creating the Workbook instance used to take less than 1 second; now it takes at least 40 seconds for the sample contents.
I did some debugging and experimenting, and found that it is fast in 23.1 and slow in 23.6. Releases 23.2 through 23.5 throw various index-out-of-bounds exceptions and OutOfMemoryErrors from the constructor code with this input, so I could not test their performance.
Our code is much larger than could be attached here, but I did isolate the functionality to a standalone code snippet and an HTML file to feed to it. Our actual code generates this HTML from a reporting system that reads rows from a database, runs them through a report template, and feeds the resulting HTML to Aspose. We can’t include that here, so I extracted the HTML it generated and put it in a file.
The standalone code snippet that shows the issue is:
import com.aspose.cells.LoadFormat;
import com.aspose.cells.LoadOptions;
import com.aspose.cells.SaveFormat;
import com.aspose.cells.Workbook;

import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Date;

public class MainExcelExport {
    public static void main(String[] args) throws Exception {
        // Both streams are opened in try-with-resources so they are closed even on failure
        try (InputStream inputStream = new FileInputStream("excelinput.html");
             ByteArrayOutputStream outputStream = new ByteArrayOutputStream()) {
            LoadOptions loadOptions = new LoadOptions(LoadFormat.HTML);
            System.err.println("DEBUG: here 1, time = " + new Date());
            // This will be extremely slow from 23.6 onward, is fast in 23.1
            Workbook workbook = new Workbook(inputStream, loadOptions);
            System.err.println("DEBUG: here 2, time = " + new Date());
            // Save it - this is always fast
            workbook.save(outputStream, SaveFormat.XLSX);
            System.err.println("DEBUG: here 3, time = " + new Date());
            // Write it out to XLSX file
            byte[] byteArray = outputStream.toByteArray();
            try (OutputStream fileOutputStream = new FileOutputStream("output.xlsx")) {
                fileOutputStream.write(byteArray);
            }
            System.err.println("DEBUG: here 4, time = " + new Date());
        }
        catch (Exception ex) {
            System.err.println("Error happened: " + ex.getMessage());
        }
    }
}
Our sample input file is:
excelinput.html.zip (24.9 KB)
I also found that performance degrades much worse than linearly with input size, roughly quadratically: double the number of rows, and the time it takes to create the Workbook quadruples.
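For anyone who wants to check the scaling independently of our data, here is a rough sketch of how it could be measured. It is not our production code. The class name ScalingCheck, the Loader interface, the buildHtmlTable helper, and the row and column counts are all made up for illustration, and the synthetic table is far simpler than our real report HTML, so it may not trigger the slowdown. The placeholder loader only reads the stream; to measure Aspose you would swap in the Workbook constructor call from the snippet above. If time grows like rows^k, then timing two sizes gives k = log(t2/t1) / log(n2/n1), and "double the rows, quadruple the time" corresponds to k = 2.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ScalingCheck {
    /** Hypothetical hook for the call being timed, e.g. in -> new Workbook(in, loadOptions). */
    interface Loader {
        void load(InputStream in) throws Exception;
    }

    /** Builds a plain synthetic HTML table (an assumption; real report HTML is more complex). */
    static String buildHtmlTable(int rows, int cols) {
        StringBuilder sb = new StringBuilder("<html><body><table>");
        for (int r = 0; r < rows; r++) {
            sb.append("<tr>");
            for (int c = 0; c < cols; c++) {
                sb.append("<td>r").append(r).append("c").append(c).append("</td>");
            }
            sb.append("</tr>");
        }
        return sb.append("</table></body></html>").toString();
    }

    /** Times a single load of the given HTML, in nanoseconds (at least 1). */
    static long timeNanos(Loader loader, String html) throws Exception {
        byte[] bytes = html.getBytes(StandardCharsets.UTF_8);
        long start = System.nanoTime();
        try (InputStream in = new ByteArrayInputStream(bytes)) {
            loader.load(in);
        }
        return Math.max(1L, System.nanoTime() - start);
    }

    /** Estimates k in time ~ rows^k from two (size, time) measurements. */
    static double growthExponent(double n1, double t1, double n2, double t2) {
        return Math.log(t2 / t1) / Math.log(n2 / n1);
    }

    public static void main(String[] args) throws Exception {
        // Placeholder loader: just drains the stream. Replace with the Aspose call to test it.
        Loader loader = in -> in.readAllBytes();

        int rows = 1000; // arbitrary starting size
        int cols = 10;   // arbitrary column count
        long t1 = timeNanos(loader, buildHtmlTable(rows, cols));
        long t2 = timeNanos(loader, buildHtmlTable(2 * rows, cols));

        System.err.println(rows + " rows: " + t1 / 1_000_000.0 + " ms");
        System.err.println((2 * rows) + " rows: " + t2 / 1_000_000.0 + " ms");
        System.err.println("estimated exponent k = " + growthExponent(rows, t1, 2 * rows, t2));
    }
}
```

A single pair of timings is noisy, so in practice it is worth repeating each measurement a few times and trying more than two sizes before reading much into the estimated exponent.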
Are there any workarounds? And can you please look into this and consider it as an Aspose defect? If there is anything we can do to the HTML stream that might work around the issue, please let us know. Reporting less data is not an option, as we have no control over what our customers need to report on. Please also keep in mind that this HTML is generated from a report template (many of which are out in the wild, outside our control) and the specifics of the customer’s data, so we’re limited in how much we can change what gets generated.
Thanks very much for any help!