Using DocBuilder causes huge amounts of memory to be used


#1

I’m using the DocBuilder to insert text sections into a document: I read a text file and write each line to the document. My text file is about 2MB and contains about 32,000 lines; the resulting doc file is 4.5MB. The problem is that while the Word document is being created, memory use rises to an incredible level, which slows the process down to a near-unusable level. I can supply the text file if needed. From there, it’s just a case of…
… read line from text file
… write line to docbuilder
… repeat to end of file

Thanks,

Steve.


#2

Hi Steve,

Thanks, I’ve got the file. As I said elsewhere, the Aspose.Word file size to memory ratio is about 1:10, which is obviously too much for big documents like this.

We will do another round of optimization by the end of May and report the memory footprint we achieve.

Obviously we can only approach a 1:1 ratio, but cannot beat it as long as Aspose.Word keeps the whole document in memory. It is unlikely that we will change the Aspose.Word architecture to keep only portions of the file in memory to reduce memory usage even further.

What do you think is reasonable memory usage for your 4.5MB Word file?



#3

Hi Roman,
The 1:10 ratio is not being achieved in this instance, probably because I am creating hundreds or thousands of documents in quick succession. The memory issue is probably related to my previous post about memory problems when concatenating documents. The memory usage I’m seeing is bringing my dev machine (512MB) to a crawl.

Reasonable memory usage for a 4.5MB file would be as much physical RAM as is available (approx. 218MB on my 512MB machine which is a 1:48.44 ratio). The issue seems to be one of long-lived objects.

I’ll e-mail you personally regarding some solutions.

Many thanks,

Steve


#4

Hi Steve,

Here are some statistics from processing your file with the current version, Aspose.Word 1.4.9:

Input file: text, 35,323 lines, 1.95MB
Test computer: PIII 800MHz, 512MB RAM
Test: Loaded the whole file line by line, added each line to a document using DocumentBuilder.Writeln, and saved the result to a Word file.

Results:
Memory usage: 138MB
Time: 65 seconds at 100% CPU

This is 543 lines per second and 4096 bytes per line of about 90 characters. This is a 1:45 ratio, similar to your figure.

Aspose.Word 1.4.9 has had some small performance improvements since we last discussed it, and I think they were significant enough to let this test complete in just about one minute.

We are now going to spend some time purely on performance improvement and publish new results here.

What results for memory and time do you think are satisfactory for you?


#5

Hi Steve,

Aspose.Word 1.5 is out with the latest performance and memory optimizations.

So far we have only focused on building a document from a text file, but the effect of the optimization is felt throughout the component and is not limited to this scenario. We have not yet addressed your other issue with copying and appending sections, but we will address it in another round of optimizations shortly.

The latest statistics with Aspose.Word 1.5:

Input file: text, 35,323 lines, 1.95MB
Test computer: PIII 800MHz, 512MB RAM
Test: Loaded the whole file line by line, added each line to a document using DocumentBuilder.Writeln, and saved the result to a Word file.

Results:
Memory usage: 68MB
Time: 7 seconds at 100% CPU

This is 5046 lines per second and 2018 bytes per line of about 90 characters.

So compared to the previous test, we reduced memory usage by a factor of two and made it almost 10 times faster.
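
For anyone who wants to check the per-line figures quoted in this thread, here is a minimal Python sketch of the arithmetic. It is not part of the original tests: the helper name `throughput` is made up for illustration, and it assumes the memory figures are in units of 1024 × 1024 bytes and that results are rounded down.

```python
# Assumption: the memory figures in the thread are in units of 1024 * 1024 bytes.
MIB = 1024 * 1024


def throughput(lines, seconds, memory_mib):
    """Return (lines per second, bytes of memory per line), rounded down.

    Hypothetical helper for illustration only; not part of Aspose.Word.
    """
    lines_per_second = lines // seconds
    bytes_per_line = memory_mib * MIB // lines
    return lines_per_second, bytes_per_line


# Figures taken from the two test reports above.
v149 = throughput(35_323, 65, 138)  # Aspose.Word 1.4.9
v15 = throughput(35_323, 7, 68)     # Aspose.Word 1.5

print(v149)  # (543, 4096)
print(v15)   # (5046, 2018)

# Improvement between the two runs: memory ratio and speed ratio.
print(round(138 / 68, 2), round(65 / 7, 2))  # 2.03 9.29
```

Under those assumptions the output matches the posted numbers, and the last line gives roughly a 2x memory reduction and a 9.3x speedup.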