Decreasing performance for newer Aspose.Words versions

We have for a number of years been stuck on Aspose.Words 17.7 due to performance issues in newer versions. Due to legislation issues we are now forced to move on within the next year. We have made several trials over the last few years (a previous colleague posted this in 2020), and I performed an extensive analysis myself last year. Generally, we see merge times increase by around 40% with the newer versions of Aspose.

Our tests have been done on the exact same data, making only the minimal changes to our code base needed for the Aspose.Words version bump. However, we are interested in knowing what we can do to increase merge performance. As can be seen from the 2020 post by my colleague, we are forced to work with long templates with a lot of repeated sections. This is not within our control, so it is not a parameter we can work with. Instead, we would like to know if there are any optimization parameters we can use, or if we can disable some functionality we are not using, in order to step up performance.

Currently we perform the merge as follows:
execute(String[], String[]) with MailMergeCleanupOptions NONE
executeRegionsmerge(IMailMergeDataSourceRoot) with MailMergeCleanupOptions NONE and MergeDuplicateRegions = true

After this we look for any unmerged fields / regions and merge these with dummy values (this is done as part of mail validation, where unresolved values are highlighted and require manual treatment). This is done with all REMOVE_* MailMergeCleanupOptions activated.

Hope you can help us

  • netrvs

@netrvs Could you please attach your template, sample data and code that will allow us to reproduce and analyze the issue? We will check the issue and provide you more information.

We have discussed this internally, and we cannot allow merge templates and source code to be freely available on a forum. However, we can send them directly somewhere for analysis.

@netrvs It is safe to attach files in the forum, only you and Aspose staff can download your files.
Also, you can send the documents, code and data via private message. Just click my login and then press the “Message” button.

Ok, I will attach the files then. (67.9 KB)

@netrvs Thank you for the additional information. It is not quite clear from your code what your test scenario is. If possible, could you please create a simple console application that will allow us to test your scenario on our side?
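Such a console application can be little more than a timing loop around the merge. The following is only a minimal sketch: the class name MergeTimingSketch, the method medianMillis and the placeholder mergeJob are hypothetical names chosen for illustration, and the placeholder body must be replaced with your own load, merge and save steps. It runs a few untimed warm-up iterations first and reports the median, which tends to be more stable than a single measurement on the JVM.

```java
import java.util.Arrays;

final class MergeTimingSketch {

    /** Runs the job `warmups` times untimed, then `runs` times timed; returns the median in milliseconds. */
    static double medianMillis(Runnable job, int warmups, int runs) {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be >= 1");
        }
        for (int i = 0; i < warmups; i++) {
            job.run(); // untimed, gives the JIT compiler a chance to warm up
        }
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            job.run();
            nanos[i] = System.nanoTime() - start;
        }
        Arrays.sort(nanos);
        int mid = runs / 2;
        double medianNanos = (runs % 2 == 1)
                ? nanos[mid]
                : (nanos[mid - 1] + nanos[mid]) / 2.0;
        return medianNanos / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Hypothetical placeholder workload: replace with
        // "load template -> execute mail merge -> save" from your application.
        Runnable mergeJob = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i);
            }
        };
        System.out.printf("median: %.2f ms%n", medianMillis(mergeJob, 3, 11));
    }
}
```

Running the same harness against both library versions with identical templates and data gives numbers that can be compared directly.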
I have looked through your code and noticed a few things that might affect performance:

  1. In the AsposeDocumentWrapper.saveAsPdf method, you call the following two methods at the beginning: Document.updateFields and Document.updatePageLayout.

If you save the output PDF after executing mail merge, it is not required to call Document.updateFields, because fields are already updated while executing mail merge.
Calling the Document.updatePageLayout method is not required either: if the document layout is not built yet, it will be built automatically while saving to PDF.

  2. AsposeDocumentWrapper.saveAsPdf calls the private save method, in which you call Document.updateFields and Document.updatePageLayout again.

So upon saving the document to PDF you rebuild the page layout and update fields at least twice.

  3. The private SaveOutputParameters save(OutputStream out, OoxmlSaveOptions options) method from the same class. This method saves the document in DOCX format, which is a flow format, so you do not need to call Document.updatePageLayout: page layout is not required to save the document to DOCX. Calling this method anyway might significantly increase processing time, since it is quite a memory- and CPU-consuming operation.

Thanks for the reply. I will look into making a standalone application. Meanwhile, our application first generates a docx file for the end user to review and edit if needed, and then generates the final pdf based on the reviewed document, which is why we update the layout several times. However, from what I read in your comments, this is not necessary (at least in point 2 and possibly point 3)?

@netrvs No, it is definitely not required to call Document.updatePageLayout several times if you do not make any changes in the document programmatically between these calls.
Also, if your end user edits the document in MS Word, it is not required to call the Document.updateFields method either. In most cases Document.updateFields is required only if you make changes to the document programmatically; MS Word usually applies the changes immediately.
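
One way to enforce this in a wrapper class is to refresh only after a programmatic change has been reported. The following is a minimal sketch, not code from Aspose.Words: RefreshableDocument is a hypothetical stand-in interface for the two expensive calls, and RefreshGuard, markModified and refreshIfNeeded are illustrative names. In a real wrapper the two interface methods would delegate to Document.updateFields and Document.updatePageLayout.

```java
final class LayoutGuardSketch {

    /** Hypothetical stand-in for the two expensive calls on the real document object. */
    interface RefreshableDocument {
        void updateFields();
        void updatePageLayout();
    }

    /** Refreshes fields and layout only if a programmatic change was reported since the last refresh. */
    static final class RefreshGuard {
        private final RefreshableDocument doc;
        private boolean dirty = false; // assumption: a freshly merged document needs no refresh

        RefreshGuard(RefreshableDocument doc) {
            this.doc = doc;
        }

        /** Call after every programmatic change to the document. */
        void markModified() {
            dirty = true;
        }

        /** Returns true if a refresh was actually performed. */
        boolean refreshIfNeeded() {
            if (!dirty) {
                return false;
            }
            doc.updateFields();
            doc.updatePageLayout();
            dirty = false;
            return true;
        }
    }

    /** Counting fake used only to demonstrate the guard. */
    static final class CountingDocument implements RefreshableDocument {
        int fieldUpdates;
        int layoutBuilds;

        @Override
        public void updateFields() {
            fieldUpdates++;
        }

        @Override
        public void updatePageLayout() {
            layoutBuilds++;
        }
    }

    public static void main(String[] args) {
        CountingDocument doc = new CountingDocument();
        RefreshGuard guard = new RefreshGuard(doc);
        guard.refreshIfNeeded(); // nothing changed: skipped
        guard.markModified();    // e.g. after editing the document in code
        guard.refreshIfNeeded(); // performs one refresh
        guard.refreshIfNeeded(); // skipped again
        System.out.println("layout builds: " + doc.layoutBuilds);
    }
}
```

With such a guard, the save methods can call refreshIfNeeded unconditionally, and repeated saves without intermediate changes do not rebuild the layout again.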