Merging Files in Java causes an 'OutOfMemoryError'

Hello All,

I’m evaluating third party tools for a project in which the customer has rather strict requirements on output files.
The issue is somewhat related to this thread and yet again, completely different.

In short:
I have to generate Excel files (preferably in Excel 2003) which will contain up to 80 columns per row and a maximum of 500.000 rows. Yes, Five Hundred Thousand rows. Insane. I know… don’t mention it… We’re still trying to convince the user to find some other solution.

Anyway, rather fast we were confronted with the limitations on Excel 2003, where a single worksheet can only contain 65.536 rows. So we create a new sheet once we reach that number.

So far we’ve been using Jakarta POI to generate the file. On my local machine (Lenovo T500 with 4GB or RAM and using Windows XP) I’m able to generate a file up to +/- 75.000 rows. (divided into 2 sheets that is)

As we have to find a working solution for Excel 2003, I figured I could try to create multiple files, each with 1 sheet and afterwards merge the different files into one big file.

Creation of the multiple files went rather smoothly, but when I tried the Merge, and this is where Aspose.Cells enters the picture, I keep getting an ‘OutOfMemoryError’.

Now I can perfectly understand that one would say: well that’s normal, who in his right mind wants to join Excel files that each have 65.536 rows in it? But as I’m sure many of you have encountered before, one can’t always explain to a customer the limitations of programming…

What I would like to know is this:

Is there a more memory-efficient way to merge excel files then the one described in the documentation.
At this point, I don’t really care how much time it’ll take.

regards,
Pieter

Hi,

Well, since you are creating a big file, so, it will demand more memory. For the "OutOfMemory Exception", you have to specify sufficient memory for JVM. Kindly make sure that you have provided and extended the JVM memory appropriately to run your code for creating the big file. So, kindly try to extend the memory appropriately.

Following is the command line if you can make use of it:

java -Xms1000m -Xmx1000m MyTest ......... you may allot more memory appropriately.

Thank you.

Hey Amjad,

thanks for a very fast reply!

Seems I forgot to include this, rather important, part of information :slight_smile:

I start my JVM with the following argument: "-Xmx1600M"

I’m even convinced that for my particular setup (with my laptop and all) this is the maximum amount of memory I can assign.

on the side: my working hours here in Belgium are over now so I’ll probably won’t reply to or read this thread before tomorrow 8am CET.

regards,
Pieter

To server your need, the most efficient way is not merging serveral files into one, but creating a single multi-worksheets Excel file and populating data into it one sheet after one sheet. For example, you can create a file with 2 worksheet, populate 50K rows data into the first sheet, and populate 50K rows data into the second sheet. In this way, I think you can create larger Excel files.

Hey Laurence,

thanks for your reply!

What you say is 100% correct. And we did try that at some point in time.
However (yes, you did expect the little ‘however’ there, didn’t you? :slight_smile: )

Whenever I create a single Excel File, again: using 80 columns per row, with about 80.000 rows I get an OutOfMemoryError from within the JakartaPoi part of my application.

I fill out the first sheet with the 65536 rows, then continue onto the next sheet and poof, my JVM crashes due to not having enough memory.

Hence our solution (for now) to create multiple files, one sheet each, and afterwards merge the different files into one large® file.


But I’m beginning to think this solution might not work…

Any other ideas?

regards,
Pieter

Hi,

Thank you for considering Aspose.

I tested the performance of Aspose.Cells for java as per your scenario and I was able to generate an XLS file about 205MB size with around 160k Rows and 80 Columns (3 worksheets) with the following test code.

Sample Code:

Workbook workbook = new Workbook();

Worksheets worksheets = workbook.getWorksheets();

Worksheet worksheet = worksheets.getSheet(0);

for (int i = 0; i <= 65000; i++)

{

Cells cells = worksheet.getCells();

for (int j = 0; j <= 80; j++)

{

cells.getCell(i,j).setValue(“Aspose.Cells”);

}

}

Worksheet worksheet1 = worksheets.addSheet();

for (int i = 0; i <= 65000; i++)

{

Cells cells1 = worksheet1.getCells();

for (int j = 0; j <= 80; j++)

{

cells1.getCell(i,j).setValue(“Aspose.Cells”);

}

}

Worksheet worksheet2 = worksheets.addSheet();

for (int i = 0; i <= 30000; i++)

{

Cells cells2 = worksheet2.getCells();

for (int j = 0; j <= 80; j++)

{

cells2.getCell(i,j).setValue(“Aspose.Cells”);

}

}

workbook.save(“C:\result.xls”);

Please execute this code and share your finding with us. Also, please share your code which you are using to test your scenario. Following are my system specs.

1.7 GHz processor.

1 GB RAM

Windows XP SP2.

Thank You & Best Regards,

Hey Nausherwan,

Just letting you know that I tested your code (with some minor modifications to represent a more accurate representation of our code) and it works perfectly!

I did need to add some garbage collection code, but other then that, it seems to do the trick.

Thanks a million for your help!

Now all I need is an ‘ok’ from out customer :slight_smile:

Thanks to all that contributed to this thread!

Pieter