Performance of ASPOSE

Hi,

I am new to ASPOSE. I am currently using JExcelAPI with large documents, but it is rather slow and consumes a lot of memory: for an Excel file of just 8.96 MB, JExcelAPI uses 64 MB of heap. I need a comparison between JExcelAPI and ASPOSE in terms of:

1- Loading mechanism of file (full file is loaded in memory or just provided chunk is loaded)
2- Memory consumed while reading and writing
3- Performance
4- Supported Ms Excel versions
5- Cost to purchase ASPOSE

Any other information will also be welcome.

Thanks
Abrar Hussain

Dear Abrar,

Can you zip and post this 8.96 MB file here or send it to nanjing@aspose.com? We will run a comparison for items 2 and 3 and give you the results.

For other questions:

1- Loading mechanism of file (full file is loaded in memory or just provided chunk is loaded)
Generally, Aspose.Cells loads the full file into memory. It also provides options to read data cell by cell.

4- Supported Ms Excel versions
Excel97, 2000, 2002, 2003 and 2007.

5- Cost to purchase ASPOSE
You can find the price at `http://www.aspose.com/Purchase/Aspose.Cells/`. If you have any other questions about purchasing, please post them at http://www.aspose.com/community/forums/aspose.purchase/220/showforum.aspx . Our sales team will support you soon. Thank you.

Hi,

Thanks for your reply.
I am a bit confused: the loading mechanism appears to be the same in JExcelAPI and Aspose. The full file is loaded into memory, which is what I don’t want.

Please give me statistics: if I provide a sheet with all columns and all rows filled (256 * 65535), how much memory will Aspose consume to process it?

Thanks
Abrar Hussain

Hi Abrar Hussain,

Thanks for considering Aspose. The latest version of Aspose.Cells for Java provides three modes for processing an Excel file:
1. Workbook.open(): reads all data and objects in the workbook, including all cells, formatting, and objects such as shapes and charts.
If you know the file format type, please use the Workbook.open(String, FileFormatType) method; it saves memory and running time.
2. Workbook.loadData(): reads only the data and formatting of cells in the workbook and ignores all other objects such as shapes and charts. It is better suited to cases where the template file contains objects but the user only needs to access cell data and formatting.
3. LightCells.processWorkbook(), LightCells.processWorksheet(): this is a new API in the newest version of Aspose.Cells for Java. It processes only the cell data of a workbook, in event-driven mode. When the user requests processing of a workbook or worksheet, Aspose.Cells parses the cells one by one and calls the user-defined action on each cell's data. It is better suited to cases where the template file mainly contains a huge data set and the user only needs to access cells one by one, sequentially. Currently it only supports cells with plain data, whose type can be int, double, boolean, String, or datetime. Advanced features such as formulas and hyperlinks are not supported.
The following statistics show the time and memory cost of opening the template file you sent and accessing all of its cells:

Workbook.open:
Cells processed: 531291
time: 1875
before gc, memory used/total: 30752296/77524992
after gc, memory used/total: 30110720/77524992


Workbook.loadData:
Cells processed: 531291
time: 1390
before gc, memory used/total: 36609096/55615488
after gc, memory used/total: 26810328/55615488


Event-Driven mode by LightCells:
Cells processed: 474843
time: 625
before gc, memory used/total: 20340128/27398144
after gc, memory used/total: 9916416/28119040


JExcelApi:
Cells processed: 726671
time: 2000
before gc, memory used/total: 23006864/90951680
after gc, memory used/total: 19092328/68132864
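
For readers who want to collect figures of the same shape for their own files, here is a minimal sketch. It assumes the "memory used/total" pairs come from java.lang.Runtime (total heap minus free heap, and total heap) and that the time comes from System.currentTimeMillis(); the thread does not show the actual measurement code. MemorySnapshot is a hypothetical helper name, and the array fill is only a placeholder for the real workbook processing.

```java
/** Hypothetical helper: captures "used/total" heap figures via java.lang.Runtime. */
final class MemorySnapshot {
    final long usedBytes;
    final long totalBytes;

    private MemorySnapshot(long usedBytes, long totalBytes) {
        this.usedBytes = usedBytes;
        this.totalBytes = totalBytes;
    }

    /** Reads the current heap size and derives the used part (both in bytes). */
    static MemorySnapshot take() {
        Runtime rt = Runtime.getRuntime();
        long total = rt.totalMemory();
        long used = total - rt.freeMemory();
        return new MemorySnapshot(used, total);
    }

    @Override
    public String toString() {
        return usedBytes + "/" + totalBytes;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();

        // Placeholder workload; replace with the spreadsheet processing under test.
        long[] data = new long[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }

        long elapsedMillis = System.currentTimeMillis() - start;
        System.out.println("Cells processed: " + data.length);
        System.out.println("time: " + elapsedMillis);
        System.out.println("before gc, memory used/total: " + take());
        System.gc(); // only a hint; the JVM may not collect immediately
        System.out.println("after gc, memory used/total: " + take());
    }
}
```

Because System.gc() is only a request, the "after gc" figure can vary between runs, so it is worth repeating the measurement a few times.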

Hi Warren,

Thanks for such a quick response. This information is really helpful for us, and we consider ASPOSE a very valuable option, especially the event-driven mode provided by LightCells. With reference to your response, we have a few more queries that require your further assistance:

We are considering the following stats:
Event-Driven mode by LightCells:
Cells processed: 474843
time: 625
before gc, memory used/total: 20340128/27398144
after gc, memory used/total: 9916416/28119040

1. What is the time factor shown in stats? Is it in milliseconds or seconds?
2. Are you considering memory in KBs or MBs?
3. What would be condition of memory if we process file with 6080000 cells (almost 64000 rows * 100 columns)? According to our understanding it should show the same behavior as we are using event driven model.

Thanks,
Abrar Hussain

Hi Abrar,

Thank you for considering Aspose. Regarding your questions:

1. What is the time factor shown in stats? Is it in milliseconds or seconds?

The time is in milliseconds.


2. Are you considering memory in KBs or MBs?

The memory is in bytes.
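
As a small illustration of reading the figures with these units, the sketch below converts the LightCells numbers quoted above into MiB and seconds. StatUnits and its methods are hypothetical names used only for this example.

```java
/** Hypothetical helper for converting the raw statistics into friendlier units. */
final class StatUnits {
    /** Converts a byte count to mebibytes (1 MiB = 1024 * 1024 bytes). */
    static double bytesToMiB(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    /** Converts milliseconds to seconds. */
    static double millisToSeconds(long millis) {
        return millis / 1000.0;
    }

    public static void main(String[] args) {
        // LightCells figures quoted in this thread ("after gc" line and time).
        long usedBytes = 9_916_416L;
        long totalBytes = 28_119_040L;
        long timeMillis = 625L;

        System.out.printf("used:  %.2f MiB%n", bytesToMiB(usedBytes));
        System.out.printf("total: %.2f MiB%n", bytesToMiB(totalBytes));
        System.out.printf("time:  %.3f s%n", millisToSeconds(timeMillis));
    }
}
```

Read this way, the LightCells run used roughly 9.5 MiB of heap after garbage collection and took well under one second.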


3. What would be condition of memory if we process file with 6080000 cells (almost 64000 rows * 100 columns)?

In fact, some global data is shared by all sheets, such as String values and the styles that denote value types; we have to parse this data up front and keep it in memory for parsing cells later. For performance reasons, we also keep the binary data of cells in memory. So the memory cost increases with the size of the dataset in the file. Another limitation of LightCells is that it does not support modifying and re-saving the workbook file. The following is my test with an Excel file that contains a large dataset:

Cells count processed: 6898737
time: 16187
before gc, memory used/total: 128436416/350392320
after gc, memory used/total: 127066200/350392320

JExcelApi, run with the JVM parameter -Xmx900m, throws OutOfMemoryError.
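
To put the two LightCells runs in this thread side by side, the following sketch divides the "after gc" used memory by the cell count and derives a throughput figure. PerCellCost is a hypothetical name, and the result is only a rough indicator: the used-heap figure also includes JVM and library overhead, and the two runs used different files.

```java
/** Hypothetical helper: rough per-cell memory and throughput from the quoted stats. */
final class PerCellCost {
    /** Average heap bytes per processed cell. */
    static double bytesPerCell(long usedBytes, long cells) {
        if (cells <= 0) {
            throw new IllegalArgumentException("cells must be positive");
        }
        return (double) usedBytes / cells;
    }

    /** Average number of cells processed per second. */
    static double cellsPerSecond(long cells, long millis) {
        if (millis <= 0) {
            throw new IllegalArgumentException("millis must be positive");
        }
        return cells * 1000.0 / millis;
    }

    public static void main(String[] args) {
        // Small file: 474843 cells, 625 ms, 9916416 bytes used after gc.
        System.out.printf("small: %.1f bytes/cell, %.0f cells/s%n",
                bytesPerCell(9_916_416L, 474_843L), cellsPerSecond(474_843L, 625L));
        // Large file: 6898737 cells, 16187 ms, 127066200 bytes used after gc.
        System.out.printf("large: %.1f bytes/cell, %.0f cells/s%n",
                bytesPerCell(127_066_200L, 6_898_737L), cellsPerSecond(6_898_737L, 16_187L));
    }
}
```

Both runs come out at around 20 bytes of retained heap per cell, which is consistent with the statement that memory grows with the size of the dataset even in event-driven mode.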

Because the memory cost depends not only on the file size and cell count but also on factors such as the number of shared String values and the binary data of each sheet, I suggest testing the performance of LightCells with your own template file. The following code may help you use the LightCells API:

...
LightCells lightCells = LightCells.getInstance("t.xls", FileFormatType.EXCEL2003);
CellHandler handler = new CellHandlerSample();

lightCells.processWorkbook(handler); // or, for a single sheet: lightCells.processWorksheet(0, handler);
...

class CellHandlerSample implements CellHandler
{
    private int _sheetIndex;

    /**
     * Called at the start of a worksheet; return true to process it.
     * If this handler needs the sheet index later in startCell() or process(),
     * that is, if it needs to know which worksheet is being processed,
     * it should retain the sheetIndex value here.
     */
    public boolean startSheet(int sheetIndex)
    {
        _sheetIndex = sheetIndex;
        return true;
    }

    /** Called at the start of a cell; return true to process it. */
    public boolean startCell(int row, int column)
    {
        return true;
    }

    /**
     * Processes the current cell.
     * Note: for performance reasons, the LightCell object may be reused for all cells,
     * so if the current cell's data must be kept for later use, call LightCell.copy()
     * and keep the cloned object.
     */
    public void process(LightCell cell)
    {
        int row = cell.getRowIndex();
        int col = cell.getColumnIndex();
        int type = cell.getValueType();
        Object val = cell.getValue();
        System.out.println(_sheetIndex+"-"+row+"&"+col+":"+type+", "+val);
    }
}
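
The note about the reused cell object is easy to overlook, so here is a self-contained sketch of the same callback pattern that runs without Aspose.Cells. Every type in it (SketchCell, SketchCellHandler, SketchReader, KeepFirstColumn) is a hypothetical stand-in written for this illustration, not part of the Aspose API; the "workbook" is just an in-memory array.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a reusable cell object; not the Aspose LightCell class. */
final class SketchCell {
    int row;
    int column;
    Object value;

    /** Returns an independent copy that stays valid after the callback returns. */
    SketchCell copy() {
        SketchCell c = new SketchCell();
        c.row = row;
        c.column = column;
        c.value = value;
        return c;
    }
}

/** Hypothetical callback interface with the same three callbacks as the sample above. */
interface SketchCellHandler {
    boolean startSheet(int sheetIndex);
    boolean startCell(int row, int column);
    void process(SketchCell cell);
}

/** Hypothetical driver: walks in-memory sheets and reuses ONE cell object for all callbacks. */
final class SketchReader {
    static void processWorkbook(Object[][][] sheets, SketchCellHandler handler) {
        SketchCell shared = new SketchCell(); // reused for every cell to avoid allocations
        for (int s = 0; s < sheets.length; s++) {
            if (!handler.startSheet(s)) {
                continue; // handler declined this sheet
            }
            for (int r = 0; r < sheets[s].length; r++) {
                for (int c = 0; c < sheets[s][r].length; c++) {
                    if (!handler.startCell(r, c)) {
                        continue; // handler declined this cell
                    }
                    shared.row = r;
                    shared.column = c;
                    shared.value = sheets[s][r][c];
                    handler.process(shared);
                }
            }
        }
    }
}

/** Example handler: keeps only first-column cells, copying each one before storing it. */
final class KeepFirstColumn implements SketchCellHandler {
    final List<SketchCell> kept = new ArrayList<>();

    public boolean startSheet(int sheetIndex) { return true; }

    public boolean startCell(int row, int column) { return column == 0; }

    public void process(SketchCell cell) {
        kept.add(cell.copy()); // storing 'cell' itself would alias the shared object
    }
}
```

If process() stored the cell reference directly instead of the copy, every list entry would end up pointing at the same object and show the last cell's data. Filtering in startCell() lets the handler skip work for cells it does not need.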

Hi,

Thanks for using Aspose.Cells.

For detailed information regarding LightCells API(s) and its interface, please see the following documentation topics.