Dynamic array formula does not populate spilled cells

Hi,

I’m trying to use the dynamic arrays feature of Aspose.Cells for .NET, but reading the value of the spilled cells returning null values. Also, formulas using the spilled area just uses the first value of the array.

See the Excel files below that I populate with the following code snippet:

var workbook = new Workbook("./Book1.xlsx");
var worksheet = workbook.Worksheets.First(ws => ws.Name == "Sheet1");
var random = new Random((int)DateTimeOffset.UtcNow.Ticks);
var expectedValue = 0;
for (int i = 0; i < 1000; i++)
{
	var value = random.Next(1, 1500);
	worksheet.Cells[$"A{i + 1}"].Value = (char)('A' + (i % 4));
	worksheet.Cells[$"B{i + 1}"].Value = value;
	if (i % 4 == 0)
	{
		expectedValue += value;
	}
}

worksheet.Cells["A1001"].Value = "End";
var ws2 = workbook.Worksheets.First(ws => ws.Name == "Sheet2");
workbook.RefreshDynamicArrayFormulas(false);
workbook.CalculateFormula();

Console.WriteLine($"Expected: {expectedValue}");
Console.WriteLine($"Actual: {ws2.Cells["D1"].Value}");
Console.WriteLine($"{ws2.Cells["C2"].Value}");

workbook.Save("output.xlsx");

Sheet2 checks the content of Sheet1, and creates a dynamic reference to an array based on where the End value is in column A, and collects those numbers that are next to “A”, and sums them.

When opening the output, I see the same result that I get from the program as well:
image.png (2.0 KB)

However, when I put the refresh the formula in cell C1 to the same value as before, I get the following:
image.png (8.8 KB)

Output.xlsx is the original output of the above code snippet, and output_fixed.xlsx is the one that should be the expected result.
sample_excels.zip (43.0 KB)

Can you please advise?

Regards,
Marton

@varsanyi.marton,

Thanks for the sample files and details.

Could you please also zip and attach “Book1.xlsx” file, it will help us to evaluate your issue precisely on our end.

@Amjad_Sahi

See here: Book1.zip (7.7 KB)

Thanks,
Marton

@varsanyi.marton,

Thanks for the template file.

Please notice, after an initial test, I am able to reproduce the issue as you mentioned by using your template file. I confirmed reading the value of the spilled cells return null values. Also, formulas using the spilled area just uses the first value of the array. I have logged a ticket with an id “CELLSNET-51716” for your issue. We will investigate and look into the details of the issue. Hopefully, we could sort it out soon.

Once we have an update on it, we will let you know.

1 Like

@varsanyi.marton,

We have evaluated your issue in details.
Because the dynamic array formulas in the template file depend on the calculated results of other normal cell formulas, and those normal cell formulas also depend on the newly updated cells, so before refreshing the dynamic array formulas, other formulas need to be calculated at first. Currently, please change the sample code as:

...
wb.CalculateFormula();
wb.RefreshDynamicArrayFormulas(false);
wb.CalculateFormula();
...

And, later on, we will investigate whether we can make this process simpler for users or not.

@Amjad_Sahi

Hi, thank you for the workaround. This indeed works for the attached spreadsheet, but if there are any cells that are trying to consume the value of calculated cells with other formulas, it stops working.

I have changed the sample code to the following:

var workbook = new Workbook("./Book1.xlsx");
var worksheet = workbook.Worksheets.First(ws => ws.Name == "Sheet1");
var random = new Random((int)DateTimeOffset.UtcNow.Ticks);
var expectedValue = 0;
for (int i = 0; i < 1000; i++)
{
	var value = random.Next(1, 1500);
	worksheet.Cells[$"A{i + 1}"].Value = (char)('A' + (i % 4));
	worksheet.Cells[$"B{i + 1}"].Value = value;
	if (i % 4 == 0)
	{
		expectedValue += value;
	}
}

worksheet.Cells["A1001"].Value = "End";
workbook.CalculateFormula();
workbook.RefreshDynamicArrayFormulas(false);
workbook.CalculateFormula();

Console.WriteLine($"Expected: {expectedValue}");
Console.WriteLine("Reading data from the Sheet2 sheet");
var ws2 = workbook.Worksheets.First(ws => ws.Name == "Sheet2");
Console.WriteLine($"Actual: {ws2.Cells["D1"].Value}");
Console.WriteLine($"C2: {ws2.Cells["C2"].Value}");

Console.WriteLine("Reading data from the output sheet");
var outputSheet = workbook.Worksheets.FirstOrDefault(ws => ws.Name == "Output");
if (outputSheet != null)
{
	var sum = outputSheet.Cells["C1"].IntValue;
	var count = outputSheet.Cells["B2"].IntValue;
	List<int?> output = new List<int?>(count);
	for (int i = 0; i < count; i++)
	{
		var cell = outputSheet.Cells[1, (int)('C' - 'A' + i)];
		output.Add(cell.IsNumericValue ? cell.IntValue : null);
	}

	Console.WriteLine($"Actual: {sum}");
	Console.WriteLine($"List: {string.Join(", ", output)}");
}

workbook.Save("output.xlsx");

If I remove the output tab from the Book1.xlsx, then it starts working again. Here are the spreadsheets compressed into a zip file: sample_excels.zip (54.1 KB)

I can reproduce this issue with 22.7.0 and 22.8.0 version of Aspose.Cells.

I’m looking forward to see the permanent fix of this issue.

Regards,
Marton

@varsanyi.marton,

Thanks for the new test case with new file.

You are right it stopped working again. It does not work (as I tested) when there are some cells that are trying to consume the value of calculated cells with other formulas. I have logged your test case with template file to your existing ticket “CELLSNET-51716”. We will look into it soon.

Once we have an update on it, we will let you know.

@varsanyi.marton,

This is to inform you that your issue has been resolved. The fix will be included in our upcoming release (Aspose.Cells v22.9) scheduled in the second week of September 2022. You will also be notified when the next version is released.

The issues you have found earlier (filed as CELLSNET-51716) have been fixed in this update. This message was posted using Bugs notification tool by johnson.shi

Hi @Amjad_Sahi

I have checked the updated version of the library. It seems to be working, but I need to use the following three statements to get the proper results back. Is it how it supposed to work? I’m still using the same sample Excel files and code that I have provided earlier.

workbook.CalculateFormula();
workbook.RefreshDynamicArrayFormulas(false);
workbook.CalculateFormula();

This makes the calculation slower (7-10 ms -> 30+ ms) in our final Excel file than it can be achieved by using Ctrl+Shift+Enter formulas. I understand that evaluating these dynamic array formulas is adding some complexity, but needing to call these 3 methods seems unnecessary and not really straightforward.

Thanks,
Marton

@varsanyi.marton,

Thanks for sharing your concerns.

Apparently, these 3 lines are compulsory in sequence to make it work precisely. Let us evaluate the performance measures if it is expected time frame or there is a space for further improvements. We will get back to you after further evaluation.

@varsanyi.marton,
From 22.9 we provide new method Workbook.RefreshDynamicArrayFormulas(bool calculate, CalculationOptions copts). One of the abilities this method can provide to user is to control whether calculate formulas recursively when refreshing the dynamic array formulas. You may try to replace the three lines to two like:
workbook.RefreshDynamicArrayFormulas(false, new CalculationOptions());
workbook.CalculateFormula();

Hi @johnson.shi

The performance is indeed better, but I still don’t think that this API is straightforward to use this way.

Regards,
Marton

@varsanyi.marton,
Thank you for the feedback. We may consider to integrate those two steps into one method call later, such as Workbook.Workbook.RefreshDynamicArrayFormulas(CalculationOptions copts, int flags) where the flags may denote whether do calculation while refreshing, if true then whether calculate the whole workbook or only those refreshed dynamic array formulas.
If you have many dependencies like this in the workbook and you need to update them repeatedly, we think you may also consider to enable the formula calculation chain for the workbook(Workbook.Settings.FormulaSettings.EnableCalculationChain = true).

@johnson.shi

thank your for the suggestion. We load the Excel file into the memory once, and then when we need to calculate some formulas, we copy it into another one using the workbook.Copy method to make sure that concurrent requests will not conflict with each other. I did some testing and it does not seem that the Aspose.Cells can copy the model from one workbook to another, so every time I call these two methods to get back the results, it seems that this flag does not help us because it builds the model on every calculation. Are my assumptions correct? Do you have any suggestion to make it more performant?

Thanks,
Marton

@varsanyi.marton,
We do not find this issue. When copying the workbook, the calculation chain also will be copied to the new Workbook. Please make sure you have enabled the calculation chain for your source workbook and the workbook has been calculated(to build the calculation chain).

However, because you need to copy base data to new workbook every time, we don’t think the calculation chain may help much for the performance unless most formulas in the original workbook do not depend on those cells that you will update in the new workbook. Dependencies between formulas influence the performance of refreshment and calculation. The more complicated the dependencies are, the more time will be required to refresh dynamic array formulas and calculate formulas. The new method we talked about in previous post may make the code simpler, and for some situations it also may help to improve the performance a bit. But we don’t think it can give significant difference for the performance if there are too many formulas need to be re-calculated.