Adding HtmlFragment in Table and Paragraph is slow

Hello,

Our product uses many HtmlFragment objects inside tables and paragraphs, and this is causing major performance problems when rendering the PDF. Is there anything we can do to get better performance when using HtmlFragment in tables and paragraphs?

For example, using Aspose.PDF 18.5:

This takes ~20 seconds:

        // Assumes: using System; using System.Diagnostics; using Aspose.Pdf;
        Stopwatch sw = Stopwatch.StartNew();
        Document pdf = new Document();
        Page page = pdf.Pages.Add();
        Table tblMain = new Table();
        for (int i = 0; i < 100; i++)
        {
            Row row = tblMain.Rows.Add();
            Cell cell = row.Cells.Add();
            cell.Paragraphs.Add(new HtmlFragment("<b>A</b>"));
        }
        page.Paragraphs.Add(tblMain);
        pdf.Save("C:/temp/HtmlFragmenttestTable.pdf");
        sw.Stop();
        Console.WriteLine(sw.Elapsed.TotalSeconds);
        Console.ReadLine();

This also takes ~20 seconds:

        Stopwatch sw = new Stopwatch();
        sw.Start();
        Document pdf = new Document();
        Page page = pdf.Pages.Add();
        for (int i = 0; i < 100; i++)
        {
            page.Paragraphs.Add(new HtmlFragment("<b>A</b>"));
        }
        pdf.Save("C:/temp/HtmlFragmentTestParagraph.pdf");
        sw.Stop();
        Console.WriteLine(sw.Elapsed.TotalSeconds);
        Console.ReadLine();

The machine is an i5-5300U with 12 GB of RAM.

@DenisW

Thank you for contacting support.

We would like to share that the whole Document Object Model (DOM) is loaded into memory and the necessary resources are allocated when the program executes for the first time, which adds some time to the first run. You should see improved performance when executing these snippets more than once, because loading the DOM and allocating resources does not take additional time in subsequent executions. If you are not satisfied after testing the API this way, please feel free to get back to us with your new results.
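
One way to check this is to time the first execution separately from the later ones. The following is only a minimal sketch: the `WarmupBenchmark` class and its `Measure` helper are hypothetical names introduced for illustration and are not part of Aspose.PDF, and the placeholder workload stands in for whatever document-building code is being measured.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Text;

public static class WarmupBenchmark
{
    // Runs `work` once and reports that first (cold) run on its own,
    // then runs it `runs` more times and reports each warm run.
    public static (double FirstRunSeconds, double[] WarmRunSeconds) Measure(Action work, int runs)
    {
        if (work == null) throw new ArgumentNullException(nameof(work));
        if (runs < 1) throw new ArgumentOutOfRangeException(nameof(runs));

        // First run: includes any one-time loading and allocation cost.
        var sw = Stopwatch.StartNew();
        work();
        sw.Stop();
        double first = sw.Elapsed.TotalSeconds;

        // Warm runs: the one-time cost should already have been paid.
        var warm = new double[runs];
        for (int i = 0; i < runs; i++)
        {
            sw.Restart();
            work();
            sw.Stop();
            warm[i] = sw.Elapsed.TotalSeconds;
        }
        return (first, warm);
    }

    public static void Main()
    {
        // Placeholder workload (an assumption for this sketch); replace it
        // with the code that builds and saves the PDF document.
        Action work = () =>
        {
            var sb = new StringBuilder();
            for (int i = 0; i < 100; i++) sb.Append("<b>A</b>");
        };

        var (first, warm) = Measure(work, 4);
        Console.WriteLine($"First run: {first} seconds");
        Console.WriteLine($"Warm runs average: {warm.Average()} seconds");
    }
}
```

If the warm-run average stays close to the first-run time, the one-time start-up cost is not the main contributor to the slowness.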

Thank you for the reply.

That makes sense, given that it is a .NET technology.

However, I still see ~14 seconds per 100 HtmlFragments on subsequent runs. Is there any way to optimize this, both for subsequent runs and for the initial run? Many of our scenarios that use Aspose.PDF contain about a hundred HtmlFragments, and when those are removed or replaced with TextFragments the run time improves dramatically (~400 times faster).

I know TextFragment is not really a good replacement for HtmlFragment, but I would not have expected HtmlFragment to be slower by this order of magnitude.

These are the results from the code that follows them:

Run 1 of 5: 100 HtmlFragment took: 19.9843585 seconds
Run 2 of 5: 100 HtmlFragment took: 14.3057354 seconds
Run 3 of 5: 100 HtmlFragment took: 13.9145077 seconds
Run 4 of 5: 100 HtmlFragment took: 14.1385618 seconds
Run 5 of 5: 100 HtmlFragment took: 14.1688572 seconds
Run 1 of 5: 100 TextFragment took: 1.6341424 seconds
Run 2 of 5: 100 TextFragment took: 0.0364214 seconds
Run 3 of 5: 100 TextFragment took: 0.0298986 seconds
Run 4 of 5: 100 TextFragment took: 0.0225426 seconds
Run 5 of 5: 100 TextFragment took: 0.0316697 seconds

        Aspose.Pdf.License license = new Aspose.Pdf.License()
        {
            Embedded = true
        };
        license.SetLicense("Aspose.Pdf.lic");
        var runs = 5;

        for (int t = 1; t <= runs; t++)
        {
            var sw = new Stopwatch();
            sw.Start();
            var pdf = new Document();
            var page = pdf.Pages.Add();
            for (int i = 0; i < 100; i++)
            {
                page.Paragraphs.Add(new HtmlFragment("<b>A</b>"));
            }
            pdf.Save($"C:/temp/HtmlFragmentTest{t}.pdf");
            sw.Stop();
            Console.WriteLine($"Run {t} of {runs}: 100 HtmlFragment took: {sw.Elapsed.TotalSeconds} seconds");
        }

        for (int t = 1; t <= runs; t++)
        {
            var sw = new Stopwatch();
            sw.Start();
            var pdf = new Document();
            var page = pdf.Pages.Add();
            for (int i = 0; i < 100; i++)
            {
                page.Paragraphs.Add(new TextFragment("A"));
            }
            pdf.Save($"C:/temp/TextFragmentTest{t}.pdf");
            sw.Stop();
            Console.WriteLine($"Run {t} of {runs}: 100 TextFragment took: {sw.Elapsed.TotalSeconds} seconds");
        }
        Console.ReadLine();
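
For reference, the slowdown implied by these figures can be worked out from the warm runs only (runs 2 to 5), since run 1 in each group includes start-up cost. This is just an illustrative sketch: the `SpeedupEstimate` class and `WarmAverage` helper are hypothetical names, and the timings are copied from the results listed in this post.

```csharp
using System;
using System.Linq;

public static class SpeedupEstimate
{
    // Average of all runs except the first, which includes start-up cost.
    public static double WarmAverage(double[] runSeconds)
    {
        if (runSeconds == null || runSeconds.Length < 2)
            throw new ArgumentException("Need at least two runs.", nameof(runSeconds));
        return runSeconds.Skip(1).Average();
    }

    // Timings in seconds, copied from the results reported in this post.
    public static readonly double[] HtmlFragmentRuns =
        { 19.9843585, 14.3057354, 13.9145077, 14.1385618, 14.1688572 };
    public static readonly double[] TextFragmentRuns =
        { 1.6341424, 0.0364214, 0.0298986, 0.0225426, 0.0316697 };

    // How many times slower the warm HtmlFragment runs are than the warm TextFragment runs.
    public static double Ratio()
    {
        return WarmAverage(HtmlFragmentRuns) / WarmAverage(TextFragmentRuns);
    }

    public static void Main()
    {
        Console.WriteLine($"HtmlFragment warm average: {WarmAverage(HtmlFragmentRuns)} seconds");
        Console.WriteLine($"TextFragment warm average: {WarmAverage(TextFragmentRuns)} seconds");
        Console.WriteLine($"HtmlFragment is about {Ratio():F0} times slower");
    }
}
```

Excluding run 1 keeps the comparison about steady-state cost, which is the part that does not go away on repeated executions.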

@DenisW

Thank you for elaborating further.

We have logged a ticket on your behalf regarding the slow performance of HtmlFragment and recorded your concerns in it. The ticket ID PDFNET-44828 has been linked with this thread, so you will receive a notification as soon as the ticket is resolved.

We are sorry for the inconvenience.

Hello,

Sorry to jump into this conversation, but this is the latest thread on HtmlFragment slowness.

Are there any updates on this one?

Also, is there any other way of inserting HTML into a PDF?

Thank you.

@gwert

We are afraid PDFNET-44828 is still pending investigation and will be scheduled in its due turn. We will notify you as soon as it is resolved. We appreciate your patience and cooperation in this regard. Moreover, an HTML string can be added to a PDF document only by using HtmlFragment.