Performance issues for huge data

Hi,


Greetings!

I would like to know how sorting using Aspose.cells would affect performance when huge amount of data is encountered.

My main objective is to sort an excel file which has been created and saved by an application using Aspose.cells. I added few lines of code to the application to sort the saved file using the DataSorter class of Aspose.Cells

So, for sorting I added the file in Workbook class object(Workbook wb=new Workbook("@<filepath>\excelFile.xls").
How much will this affect the performance of the application when huge data is encountered and what is the maximum size of the file that it can hold. Is this something you can help?I am in the process of evaluating an implementation of what I did. It would be great if you can provide a quick answer to my query.

Thanks in advance!

Hi,

Thanks for your posting and using Aspose.Cells.

Sorting itself is a huge operation and can take lot of time. But I believe sorting in XLSX format will give better performance than in XLS format. Also XLSX format supports millions of rows while XLS format supports thousands of rows.

Also, sorting will depend on how many rows and columns you want to sort. We can create a sample Excel file manually with maximum number of rows and some columns like 10-15 columns and then sort them using Aspose.Cells API and note the time.

You can also provide us your excel file with your sample data and we can find out the time taken by Aspose.Cells for sorting operation.

Besides sorting is CPU and Memory Intensive operation so this also depends on the power of CPU and System RAM. However, Aspose.Cells itself does not pose any limitation on the size of excel file and sorting data.

Thanks for your response.


My computer system configuration is
Processor : Intel Core 2 Duo 2.93Ghz
RAM: 8.00GB

Due to some constraints the data cannot be shared. Considering the above configuration, as you had said about the sample excel file with 10-15 columns and maximum rows could you please provide me the time taken to sort an excel file by a column using Aspose.

Thanks in advance!




Hi,

Thanks for your posting and using Aspose.Cells.

I have tested this issue with the following sample code using the latest version: Aspose.Cells
for .NET v8.5.1.5
and the entire code took less than 1 minute to execute and sorting itself took 23 seconds.

First I created the source excel file (which I will attach later for your reference along with output excel file) manually using Microsoft Excel 2010 with random numbers in a range A1:J1048576. It means, max rows of XLSX format and 10 columns. The source excel file is 53.8 MB and the output excel file generated after sorting operation is 35.8 MB.

The sorting code sorts with two keys, first key sorts with respect to column A in Descending order and the second sort key sorts with respect to column B in Ascending order.

I have also shown the Debug (Console) output of the code for your reference. You can try the same code with your own excel file and note down the time.

My System was same as yours except that my CPU is 2.8 GHz but RAM is 8GB and I tested it on Windows 7 - 64 bit.

Please note the time is in millisecond so 33295 means 33 seconds.

C#


Stopwatch sw = new Stopwatch();

sw.Start();


Workbook workbook = new Workbook(“source.xlsx”);


sw.Stop();


Debug.WriteLine(“Load Time:” + sw.ElapsedMilliseconds);

Console.WriteLine(“Load Time:” + sw.ElapsedMilliseconds);


Worksheet worksheet = workbook.Worksheets[0];


//Get the workbook datasorter object.

DataSorter sorter = workbook.DataSorter;


//Set the first order for datasorter object.

sorter.Order1 = Aspose.Cells.SortOrder.Descending;


//Define the first key.

sorter.Key1 = 0;


//Set the second order for datasorter object.

sorter.Order2 = Aspose.Cells.SortOrder.Ascending;


//Define the second key.

sorter.Key2 = 1;


//Create a cells area (range).

CellArea ca = CellArea.CreateCellArea(“A1”, “J1048576”);


sw = new Stopwatch();

sw.Start();


//Sort data in the specified data range (A1:B14)

sorter.Sort(workbook.Worksheets[0].Cells, ca);


sw.Stop();


Debug.WriteLine(“Sort Time:” + sw.ElapsedMilliseconds);

Console.WriteLine(“Sort Time:” + sw.ElapsedMilliseconds);


workbook.Save(“output.xlsx”);


Console.WriteLine(“Press any key to continue…”);

Console.ReadKey();


Debug (Console) Output:
Load Time:33295
Sort Time:23000

Hi,

Thanks for using Aspose.Cells.

In reference to above post, I have attached the source excel file used in the code and also attached the output excel file generated by the code for your reference.

Source Excel File:
( Dropbox - Error - Simplify your life )

Output Excel File:
( Dropbox - Error - Simplify your life )