Aspose Pdf HtmlFragment MEMORY LEAK


#1

Hello,

Our enterprise solution used Aspose.Pdf Generator (version 17.4) to generate PDF files. We decided to upgrade to the latest version of Aspose.Pdf (19.6) and had to rewrite some of the code to use the Aspose DOM API instead of Aspose Generator.

After switching to the new version we noticed a SIGNIFICANT PERFORMANCE DOWNGRADE: PDF saving took longer and used more RAM, which was never released. In our integration environment, generating a PDF with Aspose Generator took ~1 second and left no residual memory after garbage collection. The same operation with the Aspose DOM API took ~20 seconds and leaked around 20 MB of RAM for a 5-page PDF containing some text fragments, simple horizontal lines and HtmlFragments.

To isolate the issue, I wrote a short code snippet.

The referenced namespaces are:

using System;
using System.Diagnostics;
using System.IO;
using Aspose.Pdf;

The NuGet package used is: package id="Aspose.PDF" version="19.6.0" targetFramework="net461"

The sampleHtml1.html file contains very basic HTML:

<html>
	<head>
	</head>
	<body>
		<p>HtmlParagraph1</p>
	</body>
</html>

The code is posted at the end of this post, so anyone can test it.

After 50 iterations of creating a 33 KB PDF (which contained a single word), 40 MB of RAM had still not been released, even after forcing garbage collection. Memory snapshots taken before and after the 50 iterations (using the Diagnostic Tools in Visual Studio 2019) revealed a lot of retained Aspose objects…

40 MB might not seem like a lot, but the test was performed with a very small input!

For a production server, which is a long-running process, Aspose.Pdf will not scale properly and is basically unusable because of this memory leak!

Please review the Aspose.Pdf DOM API and make the necessary improvements!

AsposePdfConsoleApp.zip (4.1 KB)

class Program
{
    static void Main(string[] args)
    {
        var license = new Aspose.Pdf.License();
        license.SetLicense(@"D:\AsposeLicense.txt");

        int iterations = 50;

        // warmup
        Test1();
        Console.WriteLine($"Warmup completed. In VS Diagnostic Tools take a memory snapshot, then press any key to run the same test {iterations} more times");
        Console.ReadKey(true);

        var totalRamInitial = GC.GetTotalMemory(false);

        for (int i = 0; i < iterations; i++)
        {
            Test1();
            GC.Collect();
            GC.WaitForPendingFinalizers();
        }

        var totalRamFinal = GC.GetTotalMemory(false);
        Console.WriteLine($"Leaked memory (KB)  {(int)((totalRamFinal - totalRamInitial) / 1024)}");

        Console.ReadKey();
    }

    private static void Test1()
    {
        PrintPdfToFile("sampleHtml1.html", "outputPdf1.pdf");
    }

    private static void PrintPdfToFile(string sampleHtmlFilePath, string outputPdfName)
    {
        using (FileStream fs = File.Create(outputPdfName))
        {
            PrintPdfToStream(File.ReadAllText(sampleHtmlFilePath), fs);
        }
    }

    private static void PrintPdfToStream(string sampleHtml, Stream outputStream)
    {
        var sw = Stopwatch.StartNew();
        long ramInitial = GC.GetTotalMemory(false);
        long durationRender, durationSave;
        long ramAfterRender, ramAfterSave;

        using (var pdf = new Aspose.Pdf.Document())
        {
            var page = pdf.Pages.Add();
            var htmlFragment = new HtmlFragment(sampleHtml);
            page.Paragraphs.Add(htmlFragment);

            durationRender = sw.ElapsedMilliseconds;
            sw.Restart();
            ramAfterRender = GC.GetTotalMemory(false);

            pdf.Save(outputStream);
            pdf.FreeMemory();

            durationSave = sw.ElapsedMilliseconds;
            sw.Restart();
            ramAfterSave = GC.GetTotalMemory(false);

            Debug.WriteLine($"durations(ms)      Render {durationRender}    Save {durationSave}  ");
            Debug.WriteLine($"deltaRam(KB)  AfterRender {(int)((ramAfterRender - ramInitial) / 1024)}    AfterSave {(int)((ramAfterSave - ramInitial) / 1024)}  ");
        }
    }
}

#2

@bpopeti

Thank you for contacting support.

We have logged an investigation ticket with ID PDFNET-46603 in our issue management system for further investigation. The ticket ID has been linked to this thread so that you will receive a notification as soon as the ticket is resolved.

We are sorry for the inconvenience.


#3

Hello,

Any word on the status of this ticket?

Thanks


#4

@bpopeti

Please note that the ticket has been logged under the free support model and will be investigated on a first come, first served basis. Therefore, it may take some months to resolve. As soon as we have a definite update or an ETA for the ticket resolution, we will let you know.

We also offer Paid Support, where issues are investigated with higher priority. Customers with a paid support subscription report issues there that need urgent investigation. If your reported issue is a blocker, please consider subscribing to Paid Support. For further information, please visit Paid Support FAQs.