Memory leak when converting HTML to PDF with external resources (with reprosteps)

When converting HTML to PDF with external resources, Aspose.PDF allocates memory that is never released.

Repro steps are attached. We have a template.html that references logo.svg as an external resource, and this template.html is converted to PDF.

There are two variants in Program.cs, see lines 26 and 29:

  • When logo.svg is successfully resolved, Aspose.PDF allocates more and more memory.
  • When logo.svg is not loaded, Aspose.PDF does not allocate more and more memory.

Unfortunately, I am not able to analyze the memory dump because of the obfuscation. I have memory dumps with hundreds of megabytes allocated in Aspose.PDF…

ConsoleApp15.zip (6.4 KB)

@kanda

We tested the scenario in our environment using Aspose.PDF for .NET 21.3 and could not reproduce the memory issue. The program used a total of 140 MB of memory and produced a PDF document. outputpdf.pdf (136.0 KB)

Would you please test the scenario at your end with version 21.3 and let us know if you still notice any issue?

@asad.ali

Thank you very much for your answer!
I understand I lost the first round :wink: so I will do my best to explain it again, and explain it better.
Or maybe I am simply wrong.

Short story

  • Memory consumption grows with each html-to-pdf conversion when using an external image.
  • Memory consumption does not grow with each html-to-pdf conversion when there is no external image.
  • Memory consumption does not grow with each html-to-pdf conversion when the image is embedded in the html.

Long story

Aspose.PDF version
<PackageReference Include="Aspose.PDF" Version="21.3.0" />

Why I think there is a memory leak
I ran the code I submitted, which loads images as external resources (from the web or a local drive).
The template is converted to PDF 50 times.
Each conversion increases memory usage, and that memory is not released when expected (by garbage collection).

This is the memory usage for the 50 conversions. In the screenshot, the html-to-pdf conversions are performed during the period marked with yellow markers (GC) in the Process Memory (MB) bar; see also the Events bar. Memory consumption grows during the conversions.

The memory dump contains a considerable number of objects.

When the resource is not loaded and the html is converted to PDF without the image, the memory consumption does not grow - it is “constant” during the period with yellow markers.

And the memory dump contains only “a few” objects.
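The comparison above (memory that keeps growing versus memory that stays flat across repeated conversions) can also be approximated without a profiler. Below is a minimal sketch that uses only the .NET base class library; the class name MemoryGrowthProbe and its methods are hypothetical and are not part of the attached project or of Aspose.PDF. It samples the managed heap after a forced garbage collection on every iteration, so it only sees managed allocations, not native memory.

```csharp
using System;

// Hypothetical helper, not part of the attached repro project.
public static class MemoryGrowthProbe
{
    // Runs `work` `iterations` times and records the managed heap size
    // (in bytes) after a forced, blocking garbage collection each time.
    public static long[] Measure(Action work, int iterations)
    {
        var samples = new long[iterations];
        for (int i = 0; i < iterations; i++)
        {
            work();

            // Collect twice around the finalizer run so that objects
            // released by finalizers are reclaimed as well.
            GC.Collect();
            GC.WaitForPendingFinalizers();
            GC.Collect();

            samples[i] = GC.GetTotalMemory(forceFullCollection: true);
        }
        return samples;
    }

    // Difference between the last and the first sample, in bytes.
    public static long Growth(long[] samples) =>
        samples.Length == 0 ? 0 : samples[samples.Length - 1] - samples[0];
}
```

The delegate passed as `work` would wrap one html-to-pdf conversion. A growth value that keeps rising with the iteration count in the external-image variant, but stays close to zero in the embedded-image variant, would match the observation described here.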

Searching for a workaround
Calling the Document.FreeMemory() method has no effect.

I tried these implementations:

  1. Loading resources from external locations (mentioned above; memory consumption grows)
  2. Forcing the resource load to fail (mentioned above; memory consumption does not grow)
  3. Loading resources using a custom resource loader (memory consumption also grows)
  4. Embedding the image into the html template to eliminate external resource loading (memory consumption does not grow)
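Option 4 can be done before the html ever reaches the converter. The following is a minimal sketch under stated assumptions: the class name HtmlImageInliner is hypothetical, and the template is assumed to reference the image in a double-quoted attribute such as src="logo.svg". It uses only the .NET base class library and rewrites the reference into a base64 data URI.

```csharp
using System;

// Hypothetical helper, not part of the attached repro project.
public static class HtmlImageInliner
{
    // Replaces every double-quoted occurrence of `resourceRef`
    // (for example "logo.svg") in the html with a base64 data URI
    // built from `resourceBytes`.
    public static string Inline(string html, string resourceRef,
                                byte[] resourceBytes, string mimeType)
    {
        string dataUri = "data:" + mimeType + ";base64,"
                         + Convert.ToBase64String(resourceBytes);

        // Only quoted references are replaced (assumption about the template),
        // so the same text elsewhere in the page is left untouched.
        return html.Replace("\"" + resourceRef + "\"", "\"" + dataUri + "\"");
    }
}
```

For an SVG logo the MIME type would be image/svg+xml. The resulting html string has no external image reference left, so the converter has nothing to resolve for that image.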

Could I be wrong?
Of course I could; this is just an observation, not a proof. The obfuscation makes the memory dump hard to interpret. The observation leads me to suspect a memory leak: every conversion that loads an external resource increases memory consumption, but when the same resource is used as an embedded image, memory consumption does not grow. If Aspose.PDF caches resources internally, some retained memory could be OK. But with resource caching I would expect the resource to be reused, not memory consumption that grows with every conversion…

Updated steps to reproduce
Updated application: ConsoleApp15.zip (11.1 KB)

Our impacts
While preparing the repro steps I found that there is no problem with memory allocation when the images are embedded in the html. We are now investigating whether we can embed our images in the html. I guess we can, so our problem has a solution. I posted this issue to help make your product even better.

@kanda

Thanks for elaborating on the issue.

We have replicated the issue in our environment and logged it as PDFNET-49634 in our issue tracking system. We will look further into its details and keep you informed about its resolution status. Please be patient and allow us some time.

We are sorry for the inconvenience caused.