OutOfMemory with updateFields and other operations

Hey all,

the issues reported here were discovered while investigating the workaround suggested in the topic “low performance when save document to PDF format through Aspose Word Java library”.

Using updateFields, getPageCount, or saving to PDF/XPS (regardless of whether only the first page or all pages are saved) on a document larger than 3000 pages causes an OOM exception on a 32-bit JRE with 1 GB of heap available. The test code is attached if needed.

From the discussions on the other thread, I assume the problem is caused by Aspose creating the APS (Aspose Page Specification) model in memory. While I understand why this is happening and the technical challenges involved, this is a serious limitation of the updateFields functionality, in addition to the PDF save.

Regards,
Dragos

Hi Dragos,
Thanks for your request. I managed to reproduce the problem. This is the same problem as you reported in your other thread.
When you call the UpdateFields method, it internally calls UpdatePageLayout to build the page layout. This is needed to update fields that use page numbers (a TOC, for example). And here we are back to the time- and memory-consuming operation of laying the document out into pages.
I would not recommend generating such large documents. If it is acceptable, it is better to generate a few smaller documents.
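To illustrate the splitting idea, here is a minimal sketch in plain Java that works out page ranges for the smaller documents. It makes no Aspose calls. The class name `PageRangePlanner`, the method `plan`, and the 1000-pages-per-part figure are hypothetical and chosen only for the example. How the content is actually divided depends on how your document is built.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a total page count into consecutive page ranges.
class PageRangePlanner {

    // Returns inclusive, 1-based [start, end] ranges of at most maxPagesPerPart pages each.
    static List<int[]> plan(int totalPages, int maxPagesPerPart) {
        if (totalPages < 0 || maxPagesPerPart <= 0) {
            throw new IllegalArgumentException(
                "totalPages must be >= 0 and maxPagesPerPart must be > 0");
        }
        List<int[]> ranges = new ArrayList<>();
        for (int start = 1; start <= totalPages; start += maxPagesPerPart) {
            int end = Math.min(start + maxPagesPerPart - 1, totalPages);
            ranges.add(new int[] {start, end});
        }
        return ranges;
    }

    public static void main(String[] args) {
        // 1000 pages per part is an assumed limit, not a figure from Aspose.
        for (int[] range : plan(3500, 1000)) {
            System.out.println("pages " + range[0] + "-" + range[1]);
        }
    }
}
```

Note that fields depending on page numbers across the whole document, such as a TOC, would still need separate handling when the output is split this way.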
Best regards,

Hey Alexey,

generating several smaller documents (PDF is my target) is what I was trying. That, plus your colleague’s idea to use PDF.Kit, might have solved the predicament we are in. But I need a TOC in the PDF, hence the call to updateFields.

Please keep in mind that all my tests were done using plain text. I will expand these tests to use images, and I’m afraid the page limit will decrease further.

So at this time, generating a PDF that contains a TOC from Aspose.Words is limited to 3000 pages.

Regards,
Dragos

Hi
Thank you for the additional information. What kind of documents do you generate that need to be so huge?
Maybe in your case, if your goal is to generate PDF documents, it would be better to use Aspose.Pdf.
Best regards,

Hey Alexey,

the crash occurs when updateFields is called, before any save operation. Leaving aside the save-to-PDF functionality, updateFields itself is limited to documents below 3k pages, regardless of save format.

With regard to using Aspose.PDF - this is what we are currently using but we wanted to move to Aspose.Words 10 so that we can benefit from building a single model from which we generate multiple formats.

Regards,
Dragos

Hi Dragos,
Thank you for the additional information. Unfortunately, the only way I can suggest at the moment to generate such large documents is to increase the heap size.
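Before changing the setting, it can help to confirm how much heap the JVM is actually allowed to use. The following is a minimal sketch using only the standard `Runtime.maxMemory()` call. The class name `HeapCheck` and its helper methods are hypothetical. The reported value can be slightly lower than the `-Xmx` setting, depending on the JVM and garbage collector.

```java
// Hypothetical helper: reports the maximum heap the current JVM may use.
class HeapCheck {

    // Converts a byte count to whole mebibytes (rounding down).
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Maximum heap in MiB as reported by the JVM; reflects -Xmx when it is set.
    static long maxHeapMiB() {
        return toMiB(Runtime.getRuntime().maxMemory());
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

After compiling, running it as `java -Xmx1024m HeapCheck` and again with a larger `-Xmx` value shows whether the setting takes effect. A 32-bit JRE limits how far the heap can be raised.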
Best regards,

Hey Alexey,

increasing the heap and moving to 64-bit would help to a certain degree, but we would be treating the effect, not the cause. The question is whether you guys acknowledge this as a defect/limitation of your product and have any concrete plans to fix it. I’ve seen statements that you are continuously working to improve quality, and while I appreciate them, we cannot use them to plan our next moves.

Regards,
Dragos

Hi Dragos,
Thanks for your request. I logged your request into our defect database. We will investigate whether there is a way to reduce memory usage. We will let you know once we complete the analysis.
Best regards,

Thanks Alexey.

Regards,
Dragos

Hi Dragos,

A 3k-page document is a bit too big for Aspose.Words at the moment. We will eventually support such large documents, but we classify them as “unusually large”, and support for “unusually large documents” is not on the high-priority list at the moment.