Memory Usage with Aspose.Words

I am using Aspose.Words for its mail merge feature. I am not sure whether I am missing a step, but here is what I noticed.

Mail merge template file: MailingLabelsDemo.doc (provided by the demo installation)
Number of contacts per document: 30
Number of contacts to be merged: 78,000
Using the Document.MailMerge.Execute(IMailMergeDataSource) method

First Test (without calling MailMerge.Execute, but still looping through the IMailMergeDataSource by calling IMailMergeDataSource.MoveNext and IMailMergeDataSource.GetValue)

aspnet_wp.exe starting memory: 70,000K
when job completes aspnet_wp.exe memory usage is at 98,000K (stuff got loaded onto cache and things like that)
Memory used: 28,000K


Second Test (recycled and now calling MailMerge.Execute to create merged document)
aspnet_wp.exe starting memory: 75,000K
when job completes aspnet_wp.exe memory usage is at 489,000K (stuff got loaded onto cache and things like that)
Memory used: 414,000K
Generated MailingLabelsDemo.doc is about 26MB

Conclusion
414,000K - 28,000K = 386,000K was used to generate a 26MB (roughly 26,000K) file. Is this normal behavior for Aspose.Words? Is there any way I can reduce the memory footprint?

Thanks in advance,
Kharpoh


I must note that 26MB files with 78K records are quite big among Word documents. Do you really have to have one huge document? If you are concerned about memory, you might get a better result with thousands of smaller documents instead.

In general, yes: the complete document must be held in memory, and a document loaded into memory by Aspose.Words takes several times its size on disk. How many times depends on the document. In your case it is about 14 times, which does sound a bit extreme. Maybe you looked at the memory usage before a .NET garbage collection took place?
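
For reference, the ratio can be reproduced from the figures in the question. This is only a small arithmetic sketch; it is written in Python purely for convenience, and it takes 1MB as 1,000K, as the question does.

```python
# Figures reported in the question, all in K.
baseline_used_k = 98_000 - 70_000   # first test: data source looped, no MailMerge.Execute
merge_used_k = 489_000 - 75_000     # second test: MailMerge.Execute called
file_size_k = 26_000                # 26MB output file, taking 1MB as 1,000K

# Memory attributable to the merge itself, and how it compares to the file on disk.
extra_k = merge_used_k - baseline_used_k
ratio = extra_k / file_size_k

print(f"extra memory: {extra_k:,}K, ratio to file size: {ratio:.1f}x")
```

So the in-memory cost works out to somewhere between 14 and 15 times the size of the file on disk.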

Thanks for the reply. That memory usage was measured before GC was called. I am thinking of breaking the output into x number of Word documents based on the current memory usage or the size of the document.

For example: If the document object has reached y amount of actual memory used or the document has reached z file size, then the IMailMergeDataSource should stop supplying the document with data. I will then close the document and start again from there.

Is there an easy way to find out that information, either the actual memory used by the document instance or the size of the soon-to-be-generated document?

Thanks in advance,
Kharpoh

Hi,

I think that neither of the above approaches is possible. AFAIK .NET does not provide a way to measure the size of an object from managed code; you would have to use cumbersome interop instead. You could use GC.GetTotalMemory, but as you understand, this will hardly give you an idea of when to suspend the merge. There is also no way to predict the size of the generated document.

I believe it would be better to split the document prior to performing the merge, as Roman suggested. What do you think?
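
To illustrate the splitting idea, here is a minimal sketch of cutting the records into fixed-size batches before any merge starts, so that each output document only ever holds a bounded number of contacts. It is written in Python as a language-neutral illustration and does not use the Aspose.Words API. The `batches` helper and the cap of 3,000 contacts per document are assumptions made for the example, not values from this thread. In the real .NET code, each batch would presumably be exposed through its own IMailMergeDataSource and merged into a fresh copy of the template.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batches(records: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size records (hypothetical helper)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    it = iter(records)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# 78,000 contacts as in the question; the cap of 3,000 per document is an
# arbitrary assumption, chosen as a multiple of the 30 labels in the template.
contacts = range(78_000)
batch_sizes = [len(batch) for batch in batches(contacts, 3_000)]

print(f"{len(batch_sizes)} documents, largest holds {max(batch_sizes)} contacts")
```

Because the cap is a record count rather than a memory or file-size threshold, it needs neither of the measurements discussed above. A suitable batch size would have to be found by trial against the actual template.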