HTML to PDF conversion taking long time

Hi There,


I generate a PDF from an HTML string. To put things in perspective, an 18-page PDF takes approximately 15 seconds to generate, which is really slow. There are no images in my HTML, only tables and CSS. Please see the attachment for my Aspose .NET code, and let me know if you need more information.

Thanks,
Rushdeep

Hi Rushdeep,

Thanks for your inquiry and sharing code snippet.

I am afraid you are using the old Aspose.Pdf.Generator approach to generate PDF from HTML. We strongly recommend the new Aspose.Pdf (DOM) approach instead, as the old Aspose.Pdf.Generator model will become obsolete soon. I tested the scenario on my side using the new approach and did not notice any issue; the output file was generated in just 3 seconds. Please check the following code snippet, which I used to perform the conversion.

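// dataDir is assumed to hold the path of the working folder.
Aspose.Pdf.MarginInfo marginInfo = new Aspose.Pdf.MarginInfo();
marginInfo.Top = 24;
marginInfo.Bottom = 24;
marginInfo.Left = 24;
marginInfo.Right = 24;

var html = @"this is html";
var pdfDocument = new Document();
var currentPage = pdfDocument.Pages.Add();
// Apply the margins to the page; without this line marginInfo is never used.
currentPage.PageInfo.Margin = marginInfo;

// Wrap the HTML string in an HtmlFragment and add it to the page.
var htmlFragment = new HtmlFragment(html);
currentPage.Paragraphs.Add(htmlFragment);

pdfDocument.Save(dataDir + "HTMLToPDF1_out.pdf");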

You can find more information about HTML to PDF conversion with the new DOM approach in the “Convert HTML to PDF” and “Add HTML String using DOM” articles of our API documentation. Please try the above approach, and if you still face any issue, please share your sample HTML string so that we can test the scenario in our environment and respond accordingly.
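For readers comparing the two routes, the file-based conversion covered by the “Convert HTML to PDF” article can be sketched roughly as follows. This is a minimal sketch, not the code used in the test above: it assumes the HTML is saved to a file on disk and that the installed Aspose.Pdf version provides HtmlLoadOptions. The class name HtmlFileToPdf and the parameters htmlPath and pdfPath are hypothetical.

```csharp
using Aspose.Pdf;

public static class HtmlFileToPdf
{
    // htmlPath and pdfPath are hypothetical parameters: the input HTML file
    // and the PDF file to write.
    public static void Convert(string htmlPath, string pdfPath)
    {
        // HtmlLoadOptions tells Aspose.Pdf to parse the input file as HTML.
        var options = new HtmlLoadOptions();

        // Load the HTML file into a PDF document, then write it out.
        var pdfDocument = new Document(htmlPath, options);
        pdfDocument.Save(pdfPath);
    }
}
```

Whether this is faster than adding an HtmlFragment to a page depends on the HTML and the API version, so it is worth timing both.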

Best Regards,

Hi Asad,


Thanks for your reply. I made the changes you pointed out, and now the line pdfDocument.Save(@"c:\code\new_test.pdf"); hangs completely. I timed it up to 20 seconds and then stopped, as execution never moved on to the next line, nor did it throw any exception. The file being written stays at 0 KB for a long time, which suggests that no content is being written to it, but when I try to delete it, I get a Windows error saying the file is locked by the IIS process (w3wp.exe).

In a nutshell, this new code is even slower than (or possibly not working at all compared to) the previous code I had. With the old code I at least had an output; right now there is none.

Please see the attached files for the code I have now and also the HTML.

In my code you will see a foreach loop. Assume the loop executes 15 times and each iteration generates HTML (attached as aspose_html.zip; please note again that the attached HTML is for one iteration only, and the loop runs 15 times).

Thanks,
Rushdeep

Hi Rushdeep,

Thanks for sharing the input HTML file and more details with us. I tested the whole scenario as per your requirement and did not notice any issue. I added the HTML to the document 15 times in a loop, and it took only 30 seconds to save the document. I used the following code snippet to achieve this.

// dataDir is assumed to hold the path of the working folder.
string bodyContent = File.ReadAllText(dataDir + "ASPOSE.HTML");

Aspose.Pdf.MarginInfo marginInfo = new Aspose.Pdf.MarginInfo();
marginInfo.Top = 24;
marginInfo.Bottom = 24;
marginInfo.Left = 24;
marginInfo.Right = 24;

var html = bodyContent;
var pdfDocument = new Document();
var currentPage = pdfDocument.Pages.Add();
currentPage.PageInfo.Margin = marginInfo;

// Add the same HTML 15 times to mimic the 15 loop iterations.
for (int i = 0; i < 15; i++)
{
    var htmlFragment = new HtmlFragment(html);
    currentPage.Paragraphs.Add(htmlFragment);
}

pdfDocument.Save(dataDir + "HTMLToPDF1_out.pdf");

Please note that I used Aspose.Pdf for .NET 17.3.0 to implement this, and I tested the scenario on a Windows 10 x64 system. I have attached the output generated by the above code for your reference. I would also like to point out that the performance of the API depends on many factors, such as the structure and complexity of the input file, the version of the API, the environment in which you are using the API, and the system configuration.

Please try the above code to add HTML to a PDF document, and if you still face any issue, please share your environment details with us so that we can try to reproduce the issue on our side.
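Because timings vary with the factors listed above, it can help to measure the save call in your own environment before comparing numbers. The following is a minimal sketch using System.Diagnostics.Stopwatch; the SaveTimer class and its Measure helper are hypothetical names, and Thread.Sleep is only a stand-in for the real workload.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class SaveTimer
{
    // Runs the given action once and returns how long it took.
    public static TimeSpan Measure(Action action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    public static void Main()
    {
        // Stand-in workload; in the real application this would be
        // something like () => pdfDocument.Save(outputPath).
        TimeSpan elapsed = Measure(() => Thread.Sleep(50));
        Console.WriteLine("Elapsed: " + elapsed.TotalMilliseconds + " ms");
    }
}
```

Timing the HtmlFragment loop and the Save call separately shows which step accounts for the delay.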

Best Regards,

Hi Asad,


I am using Aspose DLL version 10. When I download version 17.3.0, it is not compatible with my license file. Can you run your test with version 10 and see what the results are? I have no way of running the code you suggested on version 17.

Thanks,
Rushdeep

Hi Rushdeep,


Thanks for sharing more information.

rushdeep:
Can you run your test with version 10 and see what the results are?

I tested the scenario with Aspose.Pdf for .NET 10.0.0, and I am afraid I did not notice any issue. The generated document differed from the one generated by API version 17.3.0, but the code executed in just under 30 seconds. I am attaching the generated output for your reference.

rushdeep:
I am using Aspose DLL version 10. When I download version 17.3.0, it is not compatible with my license file.

I am afraid you are using quite an old version of the API, whereas we always recommend using the latest version, which is 17.3.0. If you are facing issues related to your license, you will need to contact the Sales department to renew or upgrade it.

If you need any further assistance, please feel free to contact us.


Best Regards,