CPU consumption when converting HTML to PDF

Hi Team,

We are having a bit of trouble: when converting HTML to PDF, the CPU consumption goes through the roof.

Let me explain a bit better: we have a SOAP web service on .NET Framework 4.8 running in IIS that receives “a lot” of requests every day. More than once a day we must restart the application pool because it is using all of the CPU.

We are using Aspose.Pdf version 24.10.

Looking at Task Manager, the w3wp process seems to be “stuck”.

Code:

    void SaveDocumentToOutput(Aspose.Pdf.Document document, ref HtmlToPDFOutput output)
    {
        using (var outStream = new MemoryStream())
        {
            // Render the document to PDF in memory; nothing is written to disk.
            document.Save(outStream, Aspose.Pdf.SaveFormat.Pdf);
            output.PDF = outStream.Length > 0 ? outStream.ToArray() : Array.Empty<byte>();
        }
    }

Is there anything I can do to minimize the CPU consumption?

Some more information: after calling the web service 1000 times, 2 calls got stuck, and these two are using approx. 50% of the CPU.
image.png (16.6 KB)

image.png (10.7 KB)

@FredericoS

Have you confirmed that the issue is occurring for any kind of HTML document and not only with specific input HTML? Furthermore, have you tried saving the results to a file path instead of a memory stream? Please share some sample file(s) for our reference and steps to reproduce the issue so that we can address it accordingly.

I am experiencing the same issue. Here is a code snippet, which works with Aspose.Pdf version 22.5.0 but not 24.11.0. In the latter case it consumes a lot of CPU and never finishes.

            // Works with Aspose.Pdf version 22.5.0 but not 24.11.0.
            // Requires: System.Diagnostics, System.IO, System.Text, Aspose.Pdf (Document, HtmlLoadOptions)
            // and, judging by the API used, HtmlAgilityPack (HtmlDocument).
            for (int i = 0; i < 1000; i++)
            {
                Stopwatch stopwatch = Stopwatch.StartNew();
                var docText = "<!DOCTYPE html>\r\n<html>\r\n    <head>\r\n        <title>Aspose-Pdf-Test</title>\r\n    </head>\r\n    <body>\r\nHello World!\r\n    </body>\r\n</html>";
                // Round-trip the markup through the HTML parser.
                var doc = new HtmlDocument();
                doc.LoadHtml(docText);
                var mailHtml = doc.DocumentNode.InnerHtml;
                HtmlLoadOptions objLoadOptions = new HtmlLoadOptions();
                objLoadOptions.PageInfo.Margin.Bottom = 30;
                objLoadOptions.PageInfo.Margin.Left = 30;
                objLoadOptions.PageInfo.Margin.Right = 30;
                objLoadOptions.PageInfo.Margin.Top = 30;
                // Dispose the streams and the PDF document at the end of each iteration.
                using (var streamMailHtml = new MemoryStream(Encoding.UTF8.GetBytes(mailHtml)))
                using (var stream = new MemoryStream())
                using (var docPdf = new Document(streamMailHtml, objLoadOptions))
                {
                    docPdf.Save(stream);
                }
                stopwatch.Stop();
                Console.WriteLine($"Pdf-Generation took {stopwatch.Elapsed.TotalSeconds} sec.");
            }

Best
Grigory

Hi Ali,

Have you confirmed that the issue is occurring for any kind of HTML document and not only with specific input HTML?

The HTML is always the same; the only difference is, for example, the “username”,
and in 90% of requests we have no problem.

Furthermore, have you tried saving the results to a file path instead of a memory stream?

This is a web service that has to return the PDF, therefore we don’t want to save anything to disk.

DOCTYPE html.docx (19.0 KB)

@FredericoS

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFNET-58671

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

Hi Ali,

I was wondering: is there a way to check whether a request is “stuck” in this unfinished state, and to kill the process?

Thanks

@FredericoS

We will investigate the feasibility of this, and as soon as the ticket has been investigated or we have any information regarding its resolution, we will let you know via this forum thread. Please be patient and give us some time.

We are sorry for the inconvenience.
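
On the question above about detecting a “stuck” conversion: a minimal sketch, assuming the conversion can be wrapped in a delegate, is to run it on its own task and wait with a timeout. `ConversionWatchdog` and `TryConvert` are hypothetical names for illustration and are not part of Aspose.Pdf; only standard .NET types are used.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ConversionWatchdog
{
    // Runs 'convert' on a separate task and waits up to 'timeout' for it.
    // Returns true (and the bytes) if it finished in time, false if it is still running.
    // An exception thrown by 'convert' before the timeout surfaces as AggregateException.
    public static bool TryConvert(Func<byte[]> convert, TimeSpan timeout, out byte[] pdf)
    {
        // LongRunning asks the scheduler for a dedicated thread, so a hung
        // conversion does not occupy a regular thread-pool worker.
        Task<byte[]> task = Task.Factory.StartNew(
            convert,
            CancellationToken.None,
            TaskCreationOptions.LongRunning,
            TaskScheduler.Default);

        if (task.Wait(timeout))
        {
            pdf = task.Result;
            return true;
        }

        // Timed out: observe any late failure so it is not left unobserved.
        task.ContinueWith(
            t => { var ignored = t.Exception; },
            TaskContinuationOptions.OnlyOnFaulted);

        pdf = null;
        return false;
    }
}
```

A caller might use it as `ConversionWatchdog.TryConvert(() => ConvertHtmlToPdf(html), TimeSpan.FromSeconds(30), out pdf)`, where `ConvertHtmlToPdf` stands in for the service's own conversion method and 30 seconds is an arbitrary limit. This only detects the hang: the timed-out work keeps running and keeps using CPU, because .NET has no safe way to terminate a thread. Actually reclaiming the CPU would still require recycling the application pool or moving the conversion into a separate worker process that can be killed.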