Calculation time increasing issue

Hi,


Attached template has ~1000 data rows that contain formulas. It was created using Aspose Cells and has setting for transition formula evaluation on (Transition Formula Evaluation support).
We use the following code to get calculated values for the cells:

<pre style=“font-family: “Courier New”; background-color: rgb(255, 255, 255);”>@Test
public void calcTest() throws Exception {
Workbook wb = new Workbook(“C:/temp/calc_template.xlsx”);
long t = System.currentTimeMillis();

Worksheet ws = wb.getWorksheets().get(0);
Cells cells = ws.getCells();
int lastRow = Math.max(0, cells.getMaxDataRow());
int lastCol = Math.max(0, cells.getMaxDataColumn());
int formulasCount = 0;
for (int rowIndex = 0; rowIndex < lastRow; rowIndex++) {
for (int colIndex = 0; colIndex < lastCol; colIndex++) {
String formula = cells.get(rowIndex, colIndex).getFormula();
if (formula != null && !formula.isEmpty()) {
// Prepare cell to get calculated value
Cell cell = getCellForRead(wb, cells, rowIndex, colIndex);
// Read cell value
String value = cell.getDisplayStringValue();
formulasCount++;
}
}
}

System.out.println("Total calc time: " + (System.currentTimeMillis() - t) / 1000 + “s”);
System.out.println("Total formulas: " + formulasCount);
}
<pre style=“font-family: “Courier New”; background-color: rgb(255, 255, 255);”>private Cell getCellForRead(Workbook wb, Cells cells, int row, int col) throws Exception {
Cell cell = cells.get(row, col);
prepareCellToRead(wb, cell);
return cell;
}

private void prepareCellToRead(Workbook wb, Cell cell) throws Exception {
if (wb.getSettings().getCalcMode() == CalcModeType.AUTOMATIC) {
if (cell.getFormula() != null) {
ReferredAreaCollection precedents = cell.getPrecedents();
if (precedents != null) {
for (int i = 0; i < precedents.getCount(); i++) {
ReferredArea area = precedents.get(i);
if (area.getExternalFileName() != null) {
throw new Exception(“Formula refers to an external source”);
}

Cells areaSheetCells = wb.getWorksheets().get(area.getSheetName()).getCells();
int endRow = area.getEndRow() > areaSheetCells.getMaxRow() ? areaSheetCells.getMaxRow() : area.getEndRow();
int endColumn = area.getEndColumn() > areaSheetCells.getMaxColumn() ? areaSheetCells.getMaxColumn() : area.getEndColumn();

if (area.isArea()) {
for (int row = area.getStartRow(); row <= endRow; row++) {
for (int col = area.getStartColumn(); col <= endColumn; col++) {
prepareCellToRead(wb, areaSheetCells.get(row, col));
}
}
} else {
prepareCellToRead(wb, areaSheetCells.get(area.getStartRow(), area.getStartColumn()));
}
}
}
cell.calculate(true, null);
}
}
}

<pre style=“font-family: “Courier New”; background-color: rgb(255, 255, 255);”><span style=“font-family: “Times New Roman”; white-space: normal;”>Using Aspose Cells version 8.7.1 output looks as follows:<pre style=“font-family: “Courier New”; background-color: rgb(255, 255, 255);”>
Total calc time: 6s
Total formulas: 2384
<span style=“font-family: “Times New Roman”; white-space: normal;”>
<span style=“font-family: “Times New Roman”; white-space: normal;”>Using Aspose Cells version 8.7.2 and above (including latest 8.9.0) output looks as follows:
<div style=“font-family: “Times New Roman”; white-space: normal;”>
Total calc time: 266s
Total formulas: 2384

Calculation time is significantly increased. It seems that new custom calculation engine that was added in version <span style=“font-family: “Times New Roman”; white-space: normal;”>8.7.2 is the result of the problem. Release notes of 8.7.2 version state the following:

Adds CalculationOptions.CustomEngine property.

Allow user to use the new custom calculation engine to calculate formulas.

<span style=“font-family: “Times New Roman”; white-space: normal;”>…
<span style=“font-family: “Times New Roman”; white-space: normal;”>
Could you scrutinize the issue and fix it?

Thanks.
<blockquote style=“font-family: “Times New Roman”; white-space: normal; margin: 0px 0px 0px 40px; border: none; border-image-source: initial; border-image-slice: initial; border-image-width: initial; border-image-outset: initial; border-image-repeat: initial; padding: 0px;”>

Hi Anvar,


Thank you for sharing the sample.

I have raised your concerns with the product team by logging an investigative ticket as CELLSJAVA-41925. Please spare us little time to properly analyze the case and revert back with updates in this regard.

Output:
API Version: 8.9.0.0
JDK Version: 1.8.0_66
Total calc time: 158s
Total formulas: 2384

API Version: 8.7.1.0
JDK Version: 1.8.0_66
Total calc time: 4s
Total formulas: 2384

Hi,


We have evaluated your issue further.
Well, it is because we have changed the logic of Cell.Calculate() method. In older versions, we did not reset other cells’ calculation flag when calculating one cell. For example, A1’s formula is =A2+A3, A2’s formula is =A4, when calling Cell.Calculate() for A1, if A2 has not been calculated, then it will be calculated at first. Thus A2’s calculation flag will be marked as “calculated” after the Cell.Calculate() method returned. When you call this method again, A2 will not be calculated.

However, as you know, Cell.Calculate() method is a standalone process and user can call it anywhere anytime. Sometimes user may change A4’s value and then want to get the new calculated result of A1. But with the old logic A1.Calculate() cannot give the correct value because A4 has no chance to be re-calculated. According to other users’ requirements for such kind of situation, we changed the logic to reset all cells’ calculation flag before calculating one cell in Cell.Calculate() method.

For your case, we can see you have your own logic to trace the cell references and dependencies. Such kind of logic is suitable for the older logic of Cell.Calculate(). However, with the new logic of Cell.Calculate(), when one cell being calculated, all other cells it depends will be re-calculated again. For one deep chain, such as A1 refers to A2, A2 refers to A3, A3 refers to A4, in your case, older versions will calculate A4, then A3, then A2 and then A1. But with new versions, A4 will be calculated at first, then when calculating A3, A4 will be calculated again. And when calculating A2, A3 and A4 will be calculated again. It is just the duplicated calculations that caused the time cost increased.

For your case, we think you can just call Workbook.CalculateFormula() method once for all cells and then process those cells.

Thanks for your understanding!

Hi,


Thanks for analysis.Just to clarify.

We have big Excel document. Many columns was added dynamically Using Aspose Cells API as well as many rows. And the provided sample just small part of big document. Also we use many custom functions. The reasonable choice is recalculation of some changed cells with Excel formulas. And it will take too many time to recalc the entire document in case of call Workbook.CalculateFormula() that is why we cannot just call Workbook.CalculateFormula().

And using Aspose Cells version 8.7.1 we have great time of processing just six seconds.

We attached new tempalted new_calc_template.xlsx and modified the sample code according to your suggestions, the following code is for new logic of Cell.Calculate() using Aspose Cells version 8.9.0:

<pre style=“font-family: “Courier New”; background-color: rgb(255, 255, 255);”>@Test
public void calcTest() throws Exception {
Workbook wb = new Workbook(“C:/temp/new_calc_template.xlsx”);
long t = System.currentTimeMillis();

Worksheet ws = wb.getWorksheets().get(0);
int formulasCount = 0;
Cells cells = ws.getCells();
int lastRow = Math.max(0, cells.getMaxDataRow());
int lastCol = Math.max(0, cells.getMaxDataColumn());
int calcMode = wb.getSettings().getCalcMode();
for (int rowIndex = 0; rowIndex < lastRow; rowIndex++) {
for (int colIndex = 0; colIndex < lastCol; colIndex++) {
String formula = cells.get(rowIndex, colIndex).getFormula();
if (calcMode == CalcModeType.AUTOMATIC && formula != null) {
// Prepare cell to get calculated value
Cell cell = getCellForRead(cells, rowIndex, colIndex);
// Read cell value
String value = cell.getDisplayStringValue();
formulasCount++;
}
}
}

System.out.println("Total calc time: " + (System.currentTimeMillis() - t) / 1000 + “s”);
System.out.println("Total formulas: " + formulasCount);
}

<pre style=“font-family: “Courier New”; background-color: rgb(255, 255, 255);”>private Cell getCellForRead(Cells cells, int row, int col) throws Exception {
Cell cell = cells.get(row, col);
cell.calculate(true, null);
return cell;
}
Output looks as follows:
Total calc time: 54s
Total formulas: 2383

It still takes like sixty seconds that is significatly increasing from six seconds in comparison with Aspose Cells version 8.7.1. The regression is obvious.

Could you scrutinize the problem and optimize it?

Thanks.

Hi,


Could you try our latest version/fix: Aspose.Cells for Java v8.9.0.3 (attached).
I have tested your scenario/ case in a simple JAVA console program using your newly attached template file with your sample code and it works fine. It takes 3s to process the 2383 formulas on my pc (OS: Windows 8, CPU: i7, 8GB (RAM), JDK:1.7.x, etc.).

Let us know if you still have any issue with v8.9.0.3.

Thank you.

Hi,


I have tried the latest Aspose.Cells for Java v8.9.0.3, but still it takes sixty seconds. I have ran multiple times but never gotten three-five seconds.

Could you make a couple of runs on your side to reproduce the problem?

Thanks.

Hi,


Could you make sure that you are using latest fix/version of the product: Aspose.Cells for Java v8.9.0.3 (I have already attached the fix in my previous reply in the thread), you may print the version number for confirmation. I have again tried your scenario/ case and execute the JAVA program twice, it gives similar results, see the screen shots for your reference:
http://prntscr.com/by2vga

By the way, I am using JDK1.7.x, do you use JDK1.8?

Thank you.
Hi,

I have tried both Java 1.7 & 1.8 and have the same results - sixty seconds.

Did you open new_calc_template.xlsx using MS Excel before test runs?

It is important to not open new_calc_template.xlsx using MS Excel before test running, because it modifies the template.

Could you provide the exact verion of Java 1.7.x, I mean 'x' release?

Thanks.

Hi,


Well, I did not open the “<span style=“color: rgb(0, 128, 0); font-weight: bold; font-size: small; font-family: “Courier New”; background-color: rgb(255, 255, 255);”>new_calc_template.xlsx” file into Ms Excel manually prior testing your scenario/ case. It looks strange to us as it takes more time on your side. Could you print the version number of Aspose.Cells for JAVA at the start of the test to confirm you are using the correct JAR file:
e.g
Sample code:

System.out.println(CellsHelper.getVersion());

I use a simple console JAVA program to test your scenario/case. We will also try to test your scenario/ case on different machines with different OS (if possible) if we could reproduce the issue on our end.

Thank you.

Hi,


We have found the command line args to reporoduce the issue, here is:

java -agentlib:jdwp=transport=dt_socket,address=50379,server=y,suspend=n -cp .;aspose-cells-8.9.0.3.jar CalculationTime

The output looks as follows:
Listening for transport dt_socket at address: 50379
Aspose 8.9.0.3
Total calc time: 57s
Total formulas: 2383

We had not this problem using Aspose Cells version 8.7.1.
java -cp .;aspose-cells-8.7.1.jar CalculationTime

java -agentlib:jdwp=transport=dt_socket,address=50379,server=y,suspend=n -cp .;aspose-cells-8.7.1.jar CalculationTime

Both outputs (with & and without debug) look the same:
Listening for transport dt_socket at address: 50379
Aspose 8.7.1.0
Total calc time: 5s
Total formulas: 2383

We develop application using Aspose library and use debug mode that is why debug command line args have been added implicitly by IDE. It will be difficult to debug with waiting for long time in case of big report processing.

Could you review & fix this issue?

Thanks.

Hi,


Thanks for sharing the details.

You are right. If we use the command line args in debug mode using latest version/fix v8.9.0.3, the calculation time increases considerably. I can reproduce the issue with your suggested command line arguments when running my JAVA program (I used last time), see the following screenshot:
http://prntscr.com/bykoxm

I have logged your concerns against your issue “CELLSJAVA-41925” into our database. We will look into it soon.

Once we have an update on it, we will let you know here.

Thank you.

Hi,


It is very important issue to us.

Could you provide any feedback for the resolution of the problem?

Thanks.

Hi,

Thanks for your posting and using Aspose.Cells.

We are afraid there is no update for you at this moment regarding this issue. However, we have logged your comment in our database against this issue and requested the product team to provide some fix or ETA for this issue. Once there is some news for you, we will update you asap by posting in this thread.

Hi,


Do you have any feedback for us?

Thanks.

Hi,

Thanks for your posting and using Aspose.Cells.

We are working over your issue but there is no ETA available at this moment. Please spare us this week and check back on Monday (8-Aug). If your issue is not fixed then, we will log your comment again for product team. In the meanwhile, if any fix is available for you, we will let you know by posting in this thread.

Hi,

Thanks for using Aspose.Cells.

It is to update you that we are considering to provide one
option for calculating formulas to avoid calculating formulas
recursively. This option can make your old program to run in the same way
as with older versions.

Hi,


There is no need to have this option for backward compatibility. We have re-written the code according to new library logic. Everthing works well in production. Now the only one problem that we have is debugging. As mentioned from above post run only in debug mode is still slow. I have already given command line args to reporoduce the issue, here is:

java -agentlib:jdwp=transport=dt_socket,address=50379,server=y,suspend=n -cp .;aspose-cells-8.9.0.3.jar CalculationTime

The output looks as follows:
Listening for transport dt_socket at address: 50379
Aspose 8.9.0.3
Total calc time: 57s
Total formulas: 2383

Without debugging processing time looks grade and it takes 2-5 seconds.

Could you fix slowness just for debug mode?

Thanks.

Hi,


Thanks for providing us further details.

I think your concerns make sense. I have logged your comments against your issue “CELLSJAVA-41925” into our database. Our concerned developer from product team will evaluate it and we will try to fix the issue as it only occurred in debug mode using your suggested command lines.

Once we have an update on it, we will let you know here.

Thank you.

Hi,


Have you any news for us?

Thanks.

Hi,

Thanks for your posting and using Aspose.Cells.

We are afraid there is no update for you at this moment regarding this issue. However, we have logged your comment in our database against this issue and requested the product team to provide some fix or ETA for this issue. Once there is some news for you, we will update you asap by posting in this thread.