Update Fields & Convert DOCX Word Document to PDF using C# .NET Core | Performance Improvements & Memory Optimization

We recently converted one of our .NET Framework applications to .NET Core and observed a significant performance decrease; it also uses a lot more memory. We’re using Aspose.Words 19.10.0. I profiled the application with several profilers, including JetBrains dotTrace, and here are some comparisons:

Memory consumption for generating 175 documents
.NET Framework application consumes 167 MB
.NET Core application consumes 1.1 GB

Performance for generating 175 documents
.NET Framework application took 53 mins
.NET Core application took 1 hour and 15 mins

All of the profilers point to Aspose.Words in one way or another.

  1. It looks like Aspose.Words is using RegularExpressions.RegexRunner, which consumes a lot of memory and is also one of the reasons for the slow performance (screenshot attached). Is this a known issue? Is there any workaround for it? Aspose_Regex_Performance_1.JPG (71.8 KB)

  2. We’re calling the “UpdatePageLayout” method, as recommended by Aspose. This method alone took 28.7% of the overall processing time. Is this a known issue? Is there any workaround for it?
    Aspose_UpdateLayout.JPG (8.5 KB)

@imran.khan1,

When you call the Document.UpdatePageLayout method, the process has to spend extra time re-laying out the pages of the document. Aspose.Words lays out roughly 10 pages per second, so the extra time it takes to format a document into pages depends on the number of pages your Word document has. Please note that this relationship is not linear: because layout time also depends on the document’s complexity, it may take a minute to lay out the elements on one page and only a few seconds to process 100 pages.

However, we need to test your particular scenario on our end with the latest (20.6) version of Aspose.Words for .NET. Please ZIP and attach the following resources here for testing:

  • Your simplified Word document (if any) and any other resources required to successfully replicate this issue on our end
  • Please create a standalone simple Console application (source code without compilation errors) that helps us to reproduce these problems on our end and attach it here for testing. Please do not include Aspose.Words DLL files in it to reduce the file size.

This will help us understand why you are seeing decreased performance and increased memory consumption on your end. We will then be in a better position to address your concerns.

But those are the same documents we used to process in the .NET Framework app. Why is there such a significant difference in performance and memory consumption? The code is simple:

// After applying some Modifications to the docx file

Document doc = new Document(MyDir + "Rendering.docx");

doc.UpdatePageLayout();
doc.UpdateFields();

doc.Save(ArtifactsDir + "Rendering.UpdatePageLayout.2.pdf");

Here is the document:
TCS8TCS0.zip (26.8 KB)

@imran.khan1,

You are right; your code takes more time and memory when running on .NET Core. We have logged this problem in our issue tracking system so that it can be investigated. The ID of this issue is WORDSNET-20577. We will look further into the details of this problem and will keep you updated on the status of the linked issue. We apologize for the inconvenience.

Is there anything we can do to improve the performance? Any workaround?

@imran.khan1,

Your issue is currently in the queue, pending analysis. There are no workarounds available at the moment. Once the analysis of this issue is completed and the root cause is determined, we may be able to provide you with a workaround. We apologize for the inconvenience.

Looks like this started happening after we upgraded to a newer version of Aspose.Words. Also, based on our findings, it doesn’t happen only in the .NET Core application; there is a performance decrease in the .NET Framework app as well. I will keep you posted. I hope this helps your analysis.

@imran.khan1,

Please also tell us the exact older version of Aspose.Words with which you previously had no problems.

We have done some research, and here are our findings:

  • We have tried different Aspose versions (20.1, 19.10 and 19.8) with our .NET Core application. The performance is not as good as in the .NET Framework application. Just to mention, both applications are exactly the same; the only difference is that one is built on .NET Core and the other on .NET Framework. This will have a great impact on our release, and we won’t be able to release our .NET Core application.

  • In the .NET Framework application, we have also noticed a significant performance decrease after upgrading to a newer version of Aspose. We tried different Aspose versions (19.10, 19.8, 19.1) to figure out which version caused the problem, and here is the comparison:

Aspose.Words version > Time to process 500 documents
19.10 took 14 mins
19.8 took 14 mins
19.1.0 took 6 mins

Are you aware of this?

@imran.khan1,

Please see this C# Code.zip (492 Bytes). When we ran it against the TCS8TCS0.zip (26.8 KB) document on .NET Framework 4.6.1 using versions 20.6 and 19.1 of Aspose.Words for .NET, we got the following results on our end:

Can you please provide your source document(s) and a console application (source code without compilation errors) that helps us reproduce this problem on our end? On which .NET Framework version (and OS) are you observing this performance issue?

The code is really simple:

Document doc = new Document(MyDir + "Rendering.docx");

doc.UpdatePageLayout();
doc.UpdateFields();

doc.Save(ArtifactsDir + "Rendering.UpdatePageLayout.2.pdf");

We haven’t tried 20.6, but I have posted our research with different Aspose versions. As a workaround, we have rolled our application back to 19.1.0, because many clients complained about the performance issue.

@imran.khan1,

We released a new 20.7 version of Aspose.Words for .NET a few days ago; can you please try the latest version and see how it goes on your end? If the problem remains, please provide your source document(s) and a console application (source code without compilation errors) that helps us reproduce this problem on our end. On which .NET Framework version (and OS) are you observing this performance issue?

Regarding WORDSNET-20577, this issue is currently in the queue, pending analysis. We will inform you via this forum thread as soon as it is resolved.

Any update on WORDSNET-20577?

@imran.khan1,

Regarding WORDSNET-20577, we have completed the analysis of this issue. We tested the scenario with versions 19.10 and 20.8 on MS Windows under .NET Core 2.2, .NET Core 3.1 and .NET Framework 4.6.2, and got the following results:

Aspose.Words for .NET 19.10
.NET Framework 4.6.2
Load time: 2349 ms
UpdatePageLayout time: 3194 ms
UpdateFields time: 1287 ms
Save time: 596 ms
Peak memory: 85.23438 Mb
.NET Core 2.2
Load time: 2594 ms
UpdatePageLayout time: 3507 ms
UpdateFields time: 1378 ms
Save time: 637 ms
Peak memory: 82.50391 Mb
.NET Core 3.1
Load time: 2123 ms
UpdatePageLayout time: 2880 ms
UpdateFields time: 1179 ms
Save time: 463 ms
Peak memory: 89.37109 Mb


Aspose.Words for .NET 20.8
.NET Framework 4.6.2
Load time: 2423 ms
UpdatePageLayout time: 3349 ms
UpdateFields time: 1267 ms
Save time: 617 ms
Peak memory: 86.65234 Mb
.NET Core 2.2
Load time: 2572 ms
UpdatePageLayout time: 3496 ms
UpdateFields time: 1419 ms
Save time: 646 ms
Peak memory: 82.28906 Mb
.NET Core 3.1
Load time: 2102 ms
UpdatePageLayout time: 2885 ms
UpdateFields time: 1195 ms
Save time: 464 ms
Peak memory: 88.94531 Mb

In our tests, we didn’t find any significant increase in memory usage under .NET Core, and we didn’t find any bottleneck in Aspose.Words’ code. Aspose.Words for .NET is a bit slower under .NET Core 2.2 than under .NET Framework 4.6.2, but a bit faster under .NET Core 3.1. We recommend running your code under .NET Core 3.1.
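
For anyone who wants to collect the same kind of per-phase numbers on their own machine, here is a minimal sketch of one way to do it. It assumes nothing about how the figures above were actually measured. PhaseTimer and its methods are hypothetical helpers written for illustration, not part of Aspose.Words; only Stopwatch and Process come from .NET.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical helper (not an Aspose.Words API): times named phases
// and reports the peak working set of the current process.
public static class PhaseTimer
{
    // Runs one phase and stores its elapsed milliseconds under the given name.
    public static void Measure(string name, Action phase, IDictionary<string, long> results)
    {
        var stopwatch = Stopwatch.StartNew();
        phase();
        stopwatch.Stop();
        results[name] = stopwatch.ElapsedMilliseconds;
    }

    // Peak working set of the current process, in megabytes.
    public static double PeakWorkingSetMb()
    {
        using (var process = Process.GetCurrentProcess())
        {
            return process.PeakWorkingSet64 / (1024.0 * 1024.0);
        }
    }
}

// Possible usage with the calls already shown in this thread
// (requires "using Aspose.Words;"; the path is the one used elsewhere in the thread):
//
//   var results = new Dictionary<string, long>();
//   Document doc = null;
//   PhaseTimer.Measure("Load", () => doc = new Document("C:\\Temp\\TCS8TCS0.docx"), results);
//   PhaseTimer.Measure("UpdatePageLayout", () => doc.UpdatePageLayout(), results);
//   PhaseTimer.Measure("UpdateFields", () => doc.UpdateFields(), results);
//   PhaseTimer.Measure("Save", () => doc.Save("C:\\Temp\\out.pdf"), results);
//   foreach (var entry in results)
//       Console.WriteLine($"{entry.Key} time: {entry.Value} ms");
//   Console.WriteLine($"Peak memory: {PhaseTimer.PeakWorkingSetMb()} MB");
```

Timing each phase separately makes it easier to see whether a slowdown comes from loading, layout, field updates or saving. The first run in a process also includes JIT and font-loading costs, so it may be worth discarding it.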

A couple of questions:

  • Is this for generating 1 document?
  • Can you share the document you used? Both the DOCX and the PDF.

We were running our code under .NET Core 3.1. We haven’t tried newer Aspose versions; we will check those and let you know.

Previously you mentioned that the code was taking more time and memory when running on .NET Core, and that is why you created a ticket. So have you resolved it in the newer version? How come you don’t see that issue any more?

@imran.khan1,

We have logged these details in our issue tracking system and will keep you posted here on further updates.

So, did you get the details?

@imran.khan1,

On .NET Core 2.2, the following code takes 186 seconds (peak memory: 102.7695 MB) with version 19.10 of Aspose.Words for .NET and 10 seconds (peak memory: 101.6211 MB) with version 20.11 on our end. Can you please run this code on your system with both 19.10 and 20.11 and share your findings?

// Requires: using System; using System.Diagnostics; using System.IO; using Aspose.Words;
var total = Stopwatch.StartNew();
for (var i = 0; i < 100; i++)
{
    Document doc = new Document("C:\\Temp\\TCS8TCS0.docx");
    doc.UpdatePageLayout();
    doc.UpdateFields();
    var outFile = "C:\\Temp\\19.10.pdf";
    if (File.Exists(outFile))
        File.Delete(outFile);
    doc.Save(outFile);
    Console.WriteLine($"{i + 1} iteration done.");
}
total.Stop();
Console.WriteLine(string.Format("Total time {0} s", total.ElapsedMilliseconds / 1000));
Console.WriteLine(string.Format("Peak memory: {0} Mb", Process.GetCurrentProcess().PeakWorkingSet64 / (1024f * 1024f)));

Hi,

I didn’t get a chance to run your code, but we have tried the latest version of Aspose and here are our findings:

.NET Framework (Aspose.Words 19.1):
It took 2 min 30 sec to generate 200 docs. It consumed 90 MB of memory.

.NET Core 3.1 (Aspose.Words 20.8):
It took 3 min 50 sec to generate 200 docs. It consumed 200 MB of memory.

.NET Core 3.1 (Aspose.Words 20.11):
It took 2 min 16 sec to generate 200 docs. It consumed 610 MB of memory.

So performance-wise 20.11 is better (documents per second), but the memory consumption is unacceptable.
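
To make the documents-per-second comparison explicit, here is a small sketch that converts the three timings above into throughput. The Throughput class is a hypothetical helper introduced only for this illustration; the inputs are the figures reported in this post.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: converts a batch size and wall-clock time into documents per second.
public static class Throughput
{
    // Documents processed per second for a batch that took the given elapsed time.
    public static double DocsPerSecond(int documents, TimeSpan elapsed)
    {
        return documents / elapsed.TotalSeconds;
    }

    // Formats a rate with two decimals, independent of the machine's culture settings.
    public static string Format(double rate)
    {
        return rate.ToString("F2", CultureInfo.InvariantCulture);
    }
}

// Applied to the runs above (200 documents each):
//   .NET Framework, 19.1:  DocsPerSecond(200, new TimeSpan(0, 2, 30))  -> 1.33 docs/s
//   .NET Core 3.1, 20.8:   DocsPerSecond(200, new TimeSpan(0, 3, 50))  -> 0.87 docs/s
//   .NET Core 3.1, 20.11:  DocsPerSecond(200, new TimeSpan(0, 2, 16))  -> 1.47 docs/s
```

On these numbers, 20.11 on .NET Core 3.1 is about 10% faster than 19.1 on .NET Framework, while using roughly 6.8 times the memory (610 MB versus 90 MB).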

@trizetto,

We have logged these details in our issue tracking system and will keep you posted here on further updates.