Out of Memory Exception While Merging Workbooks

Hi,

We have a business use case where in we are trying to merge individual workbooks [which actually contain one single sheet] into one big work book and send out the resultant workbook to the client.

The current flow of the program is:
1. Create the resultant output workbook object
2. For each workbook Item which contains a single sheet in the DB
2a. load it into a temporary workbook
2b. Copy the sheet from temporary workbook to resultant output workbook
3. Save the resultant workbook

While performing this exercise, what we have noticed is that the garbage collection does not happen for intermittent workbooks and hence when we are trying to merge around 400+ sheets, we are getting "Out of memory exception".

The sample code base (with 100 sheet merges) and the Memory Profiling Graph obtained is as attached:

Workbook reportBook = new Workbook();
reportBook.Worksheets.Clear();
Aspose.Cells.License license = new Aspose.Cells.License();
license.SetLicense(System.IO.File.OpenRead("Aspose.Cells.lic"));

byte[] data = new byte[0]; PLEASE PERFORM A READ OF FILE TO LOAD THE BYTE ARRAY

for (int i = 0; i < 100; i++)
{
Console.WriteLine("Processing Sheet : {0}... GC Memory : {1}", i, (GC.GetTotalMemory(false) / 1024));
using (MemoryStream stream = new MemoryStream(data))
{
Aspose.Cells.Workbook wBook = new Aspose.Cells.Workbook(stream, new Aspose.Cells.LoadOptions(LoadFormat.Xlsx));
reportBook.Worksheets.Add("Sheet" + i.ToString());
reportBook.Worksheets["Sheet" + i.ToString()].Copy(wBook.Worksheets[0]);
wBook = null;
}
}


Considerations:
* Each sheet is having around 500 Rows and 40 to 50 Columns
* The entire consolidated work book of 400+ sheets has to be delivered to the client at once [I mean, we have already proposed split and use, but it has not worked]

I would really appreciate if you would work on this on priority and let us know if we do have any solutions, or is there any other better way that we can reduce the memory foot print but still achieve the desired functionality.

Thanks & Regards,
Anil
Hi,

Well, for your case when you merge hundreds of big files (each having long list of data in a single sheet with more columns) the output file would be huge sized, so it would require more resources (RAM etc.) to save such a file. Aspose.Cells would require 10 times or more RAM of the size of the final file, this is not final as it may depend on the data, formattings, formulas or other drawing or binary objects' types and its complexity etc. that may vary a bit, so make sure that you have sufficient resources to accomplish this big task. Also, when you merge all the sheets contents into a single sheet, this process would also require more resources and it would take some time to be completed . Even if you could save such a file, when you will open it in MS Excel, MS Excel too would take some time to open into it.

For your OutOfMemory issue, you may try to enhance the resources or preferably try it on 64-bit machines with enhanced configurations (e.g third generation processor).

By the way, could you use Workbook.Combine() method instead of copying individual worksheets from in the workbooks, see the document for your reference:
http://www.aspose.com/docs/display/cellsnet/Combine+Multiple+Workbooks+into+a+Single+Workbook


We are also doing some more enhancements regarding memory management when loading huge files, so we might not have any reliable workaround to cope with it on normal configured systems.

Thank you.

Hi,


The peak point while doing memory analysis of merging 100 sheets are as indicated below:

Work sheet Copying: 269 MB
Work sheet Combining: 497 MB

So the earlier one is far better than the later :slight_smile:

We are putting in the effort now to make it 64-bit, but still, would be great if we could work towards achieving a Saw-Tooth Wave on Memory Profiling rather than an Up Hill!!!

Let us know if you need any more information.

Thanks & Regards,
Anil


Hi,


Thanks for providing us further details.

Could you create a separate (runnable) console application, zip it and post it here to reproduce the issue on our end. We will check your issue and log a ticket for it into our database (Issue tracking system) and investigate your issue soon.

Thank you.