Performance issues with PivotTable.Format

Hi,

I have been using Aspose.Cells to create a report and have run into a performance issue when applying styles to a small number of cells within a pivot table.
The spreadsheet (attached) contains a pivot table with just over 4700 rows, derived from 30,000 rows of test data (present in another worksheet), and attempting to apply format to the cells in the header and grand total rows is very slow.
Reducing the number of data rows also reduces the time taken to apply the format, despite the actual number of cells being formatted remaining the same.
I have included the code I have been using to test this below (using the latest version of Aspose.Cells), and have observed the time taken to format the cells to be:

4731 rows in pivot (30,000 rows in data sheet) - 53 seconds (1 minute 30 seconds in a debug build)
2165 rows in pivot (15,000 rows in data sheet) - 11 seconds
915 rows in pivot (7,500 rows in data sheet) - 2 seconds

Any assistance you can give would be appreciated (code follows)

var book = new Workbook(@“c:\tmp\rpttest\gpr.xlsx”);
var sheet = book.Worksheets[1];
var pivot = sheet.PivotTables[0];

pivot.RefreshData();
pivot.CalculateData();

var area = pivot.TableRange1;

Style overlineStyle = book.CreateStyle();
overlineStyle.Borders[BorderType.TopBorder].LineStyle = CellBorderType.Thick;
overlineStyle.Borders[BorderType.TopBorder].Color = Color.Black;

Style footerStyle = book.CreateStyle();
footerStyle.Borders[BorderType.TopBorder].LineStyle = CellBorderType.Thick;
footerStyle.Borders[BorderType.TopBorder].Color = Color.Black;
footerStyle.Font.IsBold = true;

for (int i = 0; i <= area.EndColumn; i++)
{
pivot.Format(area.StartRow + 2, i, overlineStyle);
pivot.Format(area.EndRow, i, footerStyle);
}


Hi Bob,


Thank you for using Aspose products, and welcome to the Aspose.Cells support forum.

We have evaluated your presented scenario on our and with your provided spreadsheet and code snippet against the latest build of Aspose.Cells for .NET 8.0.2.3. We believe your code is already optimized for performance although changing it a little as elaborated below yields 50% performance improvement.

C#

Workbook book = new Workbook(myDir + “GPR.xlsx”);
var sheet = book.Worksheets[1];
var pivot = sheet.PivotTables[0];

pivot.RefreshData();
pivot.CalculateData();

var area = pivot.TableRange1;

Style overlineStyle = book.Styles[book.Styles.Add()];
overlineStyle.Borders[BorderType.TopBorder].LineStyle = CellBorderType.Thick;
overlineStyle.Borders[BorderType.TopBorder].Color = Color.Black;

Style footerStyle = book.Styles[book.Styles.Add()];
footerStyle.Borders[BorderType.TopBorder].LineStyle = CellBorderType.Thick;
footerStyle.Borders[BorderType.TopBorder].Color = Color.Black;
footerStyle.Font.IsBold = true;

//Retrieving these values in a for loop forces the API to calculate the
//PivotTable area every time loop executes, therefore store them temporarily
int startRow = area.StartRow;
int endRow = area.EndRow;
int endCol = area.EndColumn;

for (int i = 0; i <= endCol ; i++)
{
pivot.Format(startRow + 2, i, overlineStyle);
pivot.Format(endRow, i, footerStyle);
}


Please note, we have performed these tests on a Core i7 Q740 (1st Generation) with 6GB of RAM having Windows 7 Home Premium 64-bit platform. Using your original code snippet the overall process took a little more that 4 mins, whereas with above modifications the execution time is now under 2 mins. Please give it a try on your side, and keep us posted with your results.

Hi,

Thanks for getting back to me, and for your suggestion.

I did get a slight boost simply by using the version of Aspose.Cells you mentioned (it now takes 49 seconds down from the 53 I had using 8.0.2.0), but unfortunately storing the row and col values in separate variables as you suggested did not have any impact on the speed when I tried it.

I have been testing using a Core i5 @3.20GHz with 8GB of RAM, Windows 7 Pro, 64-bit, so I am a little surprised that your timings were so much slower than mine.

It also seems a little odd that the time taken to apply format to the cells in the pivot table seems proportional to the size of the pivot, even though the format is applied to the same number of cells in each case.
As mentioned in my original post if I roughly half the number of rows in the pivot table, the time taken drops from 49 seconds to 10 seconds. This holds true even after caching the column and row values as per your suggestion.

For comparison, I tested using Cell.SetStyle in a similar for-loop to apply format to the same number of cells in the source data sheet that has no pivot table, and this took almost no time at all.

Is there anything else I can try? As it stands, it is unfortunately too slow for me to use and I’ll have to find ways to produce the file without using a pivot table, which impacts other areas of our software.

Hi Bob,


Thank you for writing back, and sorry to know the performance isn’t still up to your expectation.

In order to further investigate the matter, and to find some workaround for the said situation, we have logged an investigative ticket (CELLSNET-42688) in our bug tracking system. Please spare us little time to properly analyze the problem cause, and to provide a fix (if applicable). In the meanwhile, we will keep you posted with updates in this regard.

The machine on which we performed the testing is a Core i7, but each core is just 1.7 GHz. Moreover, its burdened with testing platforms such as virtual machines that also share the processing cycles as well as the memory. These could be the reasons that the execution time on our end is far more than yours. Anyway, before we made any modifications to the source code for improvements, we bench-marked the execution, so we could know if source code change has brought any improvements or made it worse.

The issues you have found earlier (filed as CELLSNET-42688) have been fixed in this update.


This message was posted using Notification2Forum from Downloads module by Aspose Notifier.