Hello,
My name is Klovatch Alexander.
I’m an HP employee, and I’m currently evaluating Aspose for our reporting system.
My team leader has already asked several questions here, and now it’s my turn.
During the evaluation, I measured the size of a Document object on the heap and noticed that after saving it in different formats, the object’s size grows by a different amount for each format.
E.g. a Document object of 239458KB grows to 270813KB (~13%) when saved to docx, and to 310893KB (~30%) when saved to html.
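For reference, the two percentages can be reproduced with a few lines of Python. This is only a sketch, and it assumes both figures are measured against the same initial size of 239458KB. The helper name growth_percent is made up for illustration.

```python
def growth_percent(before_kb, after_kb):
    """Relative growth from before_kb to after_kb, in percent."""
    return (after_kb - before_kb) / before_kb * 100


# Assumption: both saves are compared against the same initial heap size.
baseline_kb = 239458
docx_growth = growth_percent(baseline_kb, 270813)
html_growth = growth_percent(baseline_kb, 310893)

print(f"docx: {docx_growth:.1f}%  html: {html_growth:.1f}%")
# prints "docx: 13.1%  html: 29.8%"
```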
I have two questions about this:
Why does the Document’s heap size grow on saving?
Why does the Document’s heap size grow by a different amount depending on the format the document is saved in?
Hello Alexander!
Thank you for your interest in Aspose.Words.
First of all, what do you mean by the Document object’s size on the heap? You can measure only the overall heap load, not the size of particular objects. In the .NET Framework, objects don’t own each other physically; they only refer to each other. So you cannot state that some objects belong to the Document object and others don’t. Almost every operation on a document leads to the allocation of new objects. That’s why memory usage grows.
Memory consumption also differs depending on how you use the library, for instance when you load a document and then save it in different formats. This is not unusual. You can estimate memory usage for your particular scenarios and documents. It may also help to invoke garbage collection at certain points in your application.
Please let us know if we can help further. Your feedback is very important for our business.
Regards,
It is true that a Document object can grow in size in memory after you save it, or after you work with it in other ways.
The reason is that many nodes in a document tree have “lazy init” or “cache” fields. That is, when a Document is first loaded from a file into memory, most of the “lazy init” fields are null. Those fields are not needed during the document load operation, for example.
Then, when you invoke Document.Save, the save operation accesses some of the properties in the document nodes and they get initialized. The size of the document tree (what you see in the profiler) grows because now there are more managed objects on the heap that can be reached from the Document object.
Examples of “lazy init” fields include Run.Font and Paragraph.ParagraphFormat. Typed collections such as Document.Sections and Body.Paragraphs are also lazy init fields.
Saving to different formats might require access to only some properties, but not others. For example, saving to DOCX requires access to fewer lazy init properties because DOCX is a Microsoft Word format (that is more native to Aspose.Words than HTML).
There is nothing wrong with this behaviour. The most important thing is that saving such a document to the same format again (after it was saved once) will not grow the heap size any further, because all the required lazy init properties have already been initialized.
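To illustrate the idea, here is a minimal sketch in Python. It is not Aspose.Words code and does not reflect its internals. The classes LazyNode and ToyDocument, and the methods save_a and save_b, are made up for this example. It only shows the pattern: fields start out empty, the first save creates them, a repeated save creates nothing new, and a format that touches more properties creates more objects.

```python
class LazyNode:
    """Toy document node with two lazily created helper objects (hypothetical)."""

    def __init__(self, text):
        self.text = text
        self._font = None    # created on first access
        self._format = None  # created on first access

    @property
    def font(self):
        if self._font is None:
            self._font = {"name": "Arial", "size": 11}
        return self._font

    @property
    def format(self):
        if self._format is None:
            self._format = {"alignment": "left"}
        return self._format


class ToyDocument:
    """Toy document: a flat list of nodes instead of a real tree."""

    def __init__(self, texts):
        self.nodes = [LazyNode(t) for t in texts]

    def initialized_fields(self):
        # Count helper objects that now exist and are reachable from the document.
        return sum(
            (n._font is not None) + (n._format is not None) for n in self.nodes
        )

    def save_a(self):
        # Hypothetical format that only needs font data.
        return [(n.text, n.font["name"]) for n in self.nodes]

    def save_b(self):
        # Hypothetical format that needs font and paragraph formatting.
        return [(n.text, n.font["name"], n.format["alignment"]) for n in self.nodes]


doc = ToyDocument(["one", "two", "three"])
after_load = doc.initialized_fields()
doc.save_a()
after_first_save = doc.initialized_fields()
doc.save_a()
after_repeat_save = doc.initialized_fields()
doc.save_b()
after_other_format = doc.initialized_fields()

print(after_load, after_first_save, after_repeat_save, after_other_format)
# prints "0 3 3 6"
```

The count stays at 3 after the second save_a call, which mirrors the point above: once a format’s properties have been touched, saving to that format again allocates nothing new.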
Once you stop using the Document object and it is garbage collected, it will be collected together with all its nodes and all the other internal objects it holds.
I hope that answers your question.
Thank you for such a quick response, Roman!
In addition, I noticed that when a Document is saved in html format, the report file is much larger and saving takes much longer (please see the attached xls file and the template2.doc template used for a single reported entity; the final report is a concatenation of X single-entity reports).
Could you please comment on the described behavior?
Thanks,
Alex K.
Thanks for your inquiry. As I told you in the other thread you created, the size of a document depends on the amount of content and the complexity of the document. Regarding the HTML format, you can try using embedded CSS to improve performance and decrease the size of the output document. See the following code:
Hi Alexey,
The provided solution indeed decreases the file size by about 30%.
Could you please provide some comparisons of Aspose.Words capabilities and benchmarks against its market competitors?
Thanks,
Alex K.
Thanks for your request. Unfortunately, we do not have any performance test results or comparisons with our competitors. However, you are free to evaluate Aspose.Words to your satisfaction and compare it with other products.
Best regards.