Aspose.Words memory leak while processing a DOCX document with Visio objects

After processing a DOCX document containing a VSDX object, the allocated memory was not collected.
It seems that something is wrong with SKAbstractManagedStream while saving the converted document.

image.png (35.1 KB)
doc.zip (197.4 KB)

static void ProcessDocument()
{
  Document doc = new Document(@"D:\doc.docx");
  HtmlSaveOptions saveOptions = new HtmlSaveOptions
  {
    ImagesFolderAlias = "images",
    MemoryOptimization = true,
    ExportHeadersFootersMode = ExportHeadersFootersMode.None,
    ExportPageMargins = true,
    ExportPageSetup = true,
    ExportRelativeFontSize = true,
  };

  doc.Save(@"D:\trash.html", saveOptions);
}

static void Main(string[] args)
{
  for (int i = 0; i < 50; i++)
  {
    ProcessDocument();
    GC.Collect();
  }
}

@directum

We have tested the scenario and have not faced the shared issue. Please note that when the document is closed, all the DOM data is purged from memory during the next garbage collector cycle. The memory may not be released until you close the application.

Moreover, Aspose.Words has some internal static objects which remain live after GC.Collect() because these static fields are GC roots.
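
To illustrate the point about static fields acting as GC roots, here is a minimal sketch that uses only the .NET base class library. The class and member names (`GcRootDemo`, `cache`, `AllocateRooted`, `AllocateUnrooted`) are hypothetical and are not part of Aspose.Words; the sketch only shows the general runtime behavior, not what Aspose.Words does internally.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class GcRootDemo
{
    // A static field is a GC root: whatever it references stays reachable.
    static byte[] cache;

    // Allocates a buffer and keeps it referenced from the static field.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference AllocateRooted()
    {
        cache = new byte[1024 * 1024];
        return new WeakReference(cache);
    }

    // Allocates a buffer that nothing references once the method returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference AllocateUnrooted()
    {
        return new WeakReference(new byte[1024 * 1024]);
    }

    public static void Main()
    {
        WeakReference rooted = AllocateRooted();
        WeakReference unrooted = AllocateUnrooted();

        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        // The rooted buffer survives because the static field still points to it.
        Console.WriteLine("rooted alive: " + rooted.IsAlive);
        // The unrooted buffer is eligible for collection and is normally gone by now.
        Console.WriteLine("unrooted alive: " + unrooted.IsAlive);
    }
}
```

A fixed set of static objects like this costs a constant amount of memory. It would not by itself explain memory that keeps growing with every conversion.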

The attached code fully reproduces the bug. As you can see, there are no references to DOM objects. But when we convert the same document several times, the memory is not released, because these objects accumulate in memory: image.png (9.6 KB).
We are using Aspose.Words 19.10.0 from NuGet.
Why are these large objects GC roots?

@directum

We have tested again the same scenario using the same document and have not faced the shared issue. After execution of for loop, the memory is released.

Please note that this is not a bug. As shared in my previous post, when the document is closed, all the DOM data is purged from memory during the next garbage collector cycle.

Moreover, the memory may not be released until you close the application.

We have tested this code on both .NET Framework 4.6 and .NET Core 2.2 and found that the leak only appears on .NET Core.
net framework 4.6.png (26.3 KB)
net core 2.2.png (23.1 KB)

@directum

Could you please share your working environment along with complete steps that you are using to reproduce this issue at our end? We will investigate the issue and provide you more information on it.

Windows 8.1

  1. Open Visual Studio 2017
  2. Create a netcoreapp2.2 console application
  3. Add the NuGet package Aspose.Words Version="19.10.0" to the project
  4. Type the following code in Program.cs
static void ProcessDocument()
{
  Document doc = new Document(@"D:\doc.docx");
  HtmlSaveOptions saveOptions = new HtmlSaveOptions
  {
    ImagesFolderAlias = "images",
    MemoryOptimization = true,
    ExportHeadersFootersMode = ExportHeadersFootersMode.None,
    ExportPageMargins = true,
    ExportPageSetup = true,
    ExportRelativeFontSize = true,
  };

  doc.Save(@"D:\trash.html", saveOptions);
}

static void Main(string[] args)
{
  for (int i = 0; i < 50; i++)
  {
    ProcessDocument();
    GC.Collect();
  }
}
  5. Compile the project
  6. Profile the compiled DLL

The profiling result shows that the allocated memory was not collected.
GC roots.png (9.6 KB)
Memory allocation.png (35.1 KB)
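
The growth seen in the profiler can also be checked in code, without a profiler attached. The following is a rough sketch, not a replacement for the profiling sessions above: `MemoryGrowthProbe` and `Measure` are hypothetical names, and the placeholder workload in `Main` stands in for the `ProcessDocument` method from the steps above, which you would pass in instead. It records both the managed heap size and the process private bytes after a full collection, since growth in native allocations would show up only in the latter.

```csharp
using System;
using System.Diagnostics;

public static class MemoryGrowthProbe
{
    // Runs `work` repeatedly and, after a full collection, records the managed
    // heap size and the private bytes of the current process for each iteration.
    public static (long Managed, long PrivateBytes)[] Measure(Action work, int iterations)
    {
        var samples = new (long Managed, long PrivateBytes)[iterations];
        for (int i = 0; i < iterations; i++)
        {
            work();

            GC.Collect();
            GC.WaitForPendingFinalizers();
            GC.Collect();

            // A fresh Process object gives a current snapshot of private bytes.
            using (Process p = Process.GetCurrentProcess())
            {
                samples[i] = (GC.GetTotalMemory(true), p.PrivateMemorySize64);
            }
        }
        return samples;
    }

    public static void Main()
    {
        // Placeholder workload (an assumption for illustration); pass
        // ProcessDocument here to measure the conversion from this thread.
        Action work = () => { byte[] tmp = new byte[100000]; tmp[0] = 1; };

        var samples = Measure(work, 10);
        for (int i = 0; i < samples.Length; i++)
        {
            Console.WriteLine("iteration " + i + ": managed=" + samples[i].Managed
                + " private=" + samples[i].PrivateBytes);
        }
    }
}
```

If the numbers level off after the first few iterations, the memory is being reclaimed. A steady increase across all 50 iterations would match what the profiler screenshots show.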

@directum

We have tested the scenario using the shared code example with .NET Core 2.2 and have not found the shared issue. The memory is released after closing the application.

We are not able to restart the application due to business rules. Our application is long-lived.
There is no problem with the Aspose.Words .NET Framework DLL, whereas the .NET Core version has more allocated memory, which grows after each document conversion. doc.zip (197.4 KB)
We have the problem with .NET Core and a certain document type: a DOCX Word document with VSDX Visio insertions.
Could you provide your working environment information?
We bought an Aspose license and want you to fix the bug that the profiling sessions prove.
netFramework profile.png (26.3 KB)
netCore profile.png (23.1 KB)

@directum
Thank you for the additional information. I have managed to reproduce the problem on my side. You are right, the memory usage grows continuously if you do the conversion in a loop. I have logged the problem in our bug tracking system as WORDSNET-19418. We will let you know once the issue is fixed.

@directum

The problem you have reported is related to a defect in SkiaSharp. We reported the problem to them: "[BUG] Memory leak when create SKTypeface from file name or stream" (Issue #996, mono/SkiaSharp on GitHub). Since their release cycle is not predictable, we have implemented a workaround that should cover most cases. However, the problem still persists when custom font sources or embedded fonts are used in the documents (in this case we have to create SKTypeface from a file name or stream). The changes we have made will be included in the 19.12 release, which will be released in a month.

Thank you. We are looking forward to solving the problem.

@directum We recently released a new version of Aspose.Words. This version includes the improvements I mentioned earlier. Also, the new version uses SkiaSharp 1.68.1, which also has improvements in memory management. Please try using the new version of Aspose.Words and let us know in case of any issues.

Thank you. It really helped us.

The issues you have found earlier (filed as WORDSNET-19418) have been fixed in this Aspose.Words for .NET 20.2 update and this Aspose.Words for Java 20.2 update.
