Question about converting HTML to PDF

I'd like to convert HTML to PDF using Aspose.Pdf, following the documentation at: https://docs.aspose.com/display/pdfnet/Convert+HTML+to+PDF+Format

When I run the code below, the call to new Document() fails with "Cannot find the source file 'C:\...\bin\Debug\input.html'". Why doesn't it look for the file in the base path, "C:\temp\"? Does anyone know what's wrong?

// Specify the base path/URL for the HTML file, which serves as the images database
String basePath = "C:/temp/";
HtmlLoadOptions htmloptions = new HtmlLoadOptions(basePath);
// Load HTML file
Document doc = new Document("input.html", htmloptions);
// Save the document as PDF
doc.Save("output.pdf");

Hi,


Thank you for contacting support. When you initialize the Document object, please pass the complete path and name of the source HTML file. With only a file name, the API looks for the source HTML file in the bin directory of the project and fails. If the file is in the base directory, concatenate the base path and the HTML file name. Please let us know if you need any further assistance or have more questions.
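
As a minimal sketch of the advice above, assuming the HTML file sits in the same folder that is passed as the base path. ResolveHtmlPath and HtmlToPdfExample are hypothetical names used only for illustration; the Aspose.Pdf calls are the ones already used in the question, and the Aspose.Pdf package must be referenced for the snippet to compile:

```csharp
using System.IO;
using Aspose.Pdf;

public static class HtmlToPdfExample
{
    // Hypothetical helper: joins the base folder and the file name into one path,
    // so the file is no longer looked up in the working (bin) directory.
    public static string ResolveHtmlPath(string basePath, string fileName)
    {
        return Path.Combine(basePath, fileName);
    }

    public static void Convert()
    {
        // Assumed folder, taken from the question
        string basePath = "C:/temp/";
        string htmlPath = ResolveHtmlPath(basePath, "input.html");

        // The base path is still passed to the load options, as in the question
        HtmlLoadOptions htmloptions = new HtmlLoadOptions(basePath);

        // Load the HTML file using its full path
        Document doc = new Document(htmlPath, htmloptions);

        // Write the PDF next to the source file instead of the bin directory
        doc.Save(Path.Combine(basePath, "output.pdf"));
    }
}
```

Path.Combine is used here instead of plain string concatenation so that a missing trailing separator in the base path does not produce a wrong file name.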

Hi,


Thanks for your reply. I have another question. Can I use Aspose.Pdf to convert HTML to a thumbnail image directly? Or do I have to convert the HTML to PDF first and then convert the PDF to an image (such as JPEG)?

Hi,


Thank you for the inquiry. You can load the HTML file with the Aspose.Pdf API and then save the pages of the resulting document directly in JPEG format. This way, you do not need to save the HTML as a PDF file first. Please try the following code:

[.NET, C#]
// Specify the base path/URL for the HTML file, which serves as the images database
string basePath = @"C:\Pdf\test63\";
HtmlLoadOptions htmloptions = new HtmlLoadOptions(basePath);
// Load HTML file
Document pdfDocument = new Document(basePath + "Input.html", htmloptions);
// Convert PDF pages to images
for (int pageCount = 1; pageCount <= pdfDocument.Pages.Count; pageCount++)
{
    using (FileStream imageStream = new FileStream(basePath + "image" + pageCount + "_out" + ".jpg", FileMode.Create))
    {
        // Create JPEG device with specified attributes
        // Width, Height, Resolution, Quality
        // Quality [0-100], 100 is Maximum
        // Create Resolution object
        Resolution resolution = new Resolution(300);
        JpegDevice jpegDevice = new JpegDevice(resolution, 100);
        // Convert a particular page and save the image to stream
        jpegDevice.Process(pdfDocument.Pages[pageCount], imageStream);
        // Close stream
        imageStream.Close();
    }
}
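
For a small thumbnail rather than full 300 dpi page images, a rough sketch could render only the first page at a fixed pixel size. It assumes the JpegDevice overload that takes width, height, resolution and quality (the one hinted at by the "Width, Height, Resolution, Quality" comment above) and an A4 page of 595 x 842 points. ThumbnailExample, ScaledHeight, the 200 pixel width, the 96 dpi resolution and the output file name are illustrative choices, not values from the documentation:

```csharp
using System;
using System.IO;
using Aspose.Pdf;
using Aspose.Pdf.Devices;

public static class ThumbnailExample
{
    // Hypothetical helper: height that keeps the page's aspect ratio for a given width.
    public static int ScaledHeight(double pageWidth, double pageHeight, int targetWidth)
    {
        return (int)Math.Round(pageHeight * targetWidth / pageWidth);
    }

    public static void SaveFirstPageThumbnail(string basePath, string htmlFileName)
    {
        HtmlLoadOptions htmloptions = new HtmlLoadOptions(basePath);
        Document pdfDocument = new Document(Path.Combine(basePath, htmlFileName), htmloptions);

        // Assumption: the page is A4 (595 x 842 points); adjust for other page sizes
        int width = 200;
        int height = ScaledHeight(595, 842, width);

        using (FileStream imageStream = new FileStream(Path.Combine(basePath, "thumbnail_out.jpg"), FileMode.Create))
        {
            // Width and height in pixels, resolution, JPEG quality (0-100)
            JpegDevice jpegDevice = new JpegDevice(width, height, new Resolution(96), 80);
            // Pages are 1-based, so index 1 is the first page
            jpegDevice.Process(pdfDocument.Pages[1], imageStream);
        }
    }
}
```

How the device fits the page into the requested width and height should be checked against the output, since this sketch only assumes that matching the page's aspect ratio avoids distortion.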

Excellent. Thank you very much.

Hi,

Could I ask another question about HTML to PDF?

I find that the logo image cannot be loaded when converting HTML to PDF or JPEG.

The logo image is referenced in the HTML by an API URL, like this:

<img id="form-logo" alt="logo" src="http://localhost/…/Logos/31ec0dda-badc-464b-a895-a78501430b8f/Image" style="width: 68px; float:left; margin-right: 30px;">

The browser shows this logo image without any problem, but when converting the HTML to PDF, it seems Aspose.Pdf cannot load it.

By the way, I used the code you showed me to do the job:

[.NET, C#]

// Specify the base path/URL for the HTML file, which serves as the images database
string basePath = @"C:\Pdf\test63\";

HtmlLoadOptions htmloptions = new HtmlLoadOptions(basePath);
// Load HTML file
Document pdfDocument = new Document(basePath + "Input.htm", htmloptions);

// Convert PDF pages to images
for (int pageCount = 1; pageCount <= pdfDocument.Pages.Count; pageCount++)
{
    using (FileStream imageStream = new FileStream(basePath + "image" + pageCount + "_out" + ".jpg", FileMode.Create))
    {
        // Create JPEG device with specified attributes
        // Width, Height, Resolution, Quality
        // Quality [0-100], 100 is Maximum
        // Create Resolution object
        Resolution resolution = new Resolution(300);
        JpegDevice jpegDevice = new JpegDevice(resolution, 100);
        // Convert a particular page and save the image to stream
        jpegDevice.Process(pdfDocument.Pages[pageCount], imageStream);
        // Close stream
        imageStream.Close();
    }
}

Hi,


Thank you for contacting support. We added an image source to the HTML as you described and converted it to JPEG, and the image appears in the output JPEG. We have attached both the HTML file and the output JPEG to this reply. Kindly let us know which version of Aspose.Pdf for .NET you are using. If this does not help, please send us a sample HTML file, and we will investigate and share our findings with you.

Hi, Imran,

Thanks for your reply. I tried the image source that you used, and it also works with my code. I noticed two differences:

  1. The image source in my code, "http://localhost/.../FormLogos/d3a1610d-bbd1-4dd3-94f3-a784016bd080/Image", goes to the server, which is currently running locally. If I type this address into the browser, it downloads the image as a .jpg file to the local disk. For the image source you used, the browser shows the image directly instead. Does this matter?

  2. Is there a time limit for loading the HTML when converting HTML to PDF/JPEG? I wonder whether the HTML might not be fully loaded, because loading the logo image from "http://localhost/.../FormLogos/d3a1610d-bbd1-4dd3-94f3-a784016bd080/Image" takes some time, as it is a little heavy.

What do you think?

Hi,


Thank you for the inquiry. One workaround is to download the image with a web request, embed it in the HTML, and then convert the HTML to JPEG. It would also help if you could provide a publicly accessible URL of this kind. In the meantime, we are replicating your scenario in our environment and will let you know our findings soon.
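
The workaround above could be sketched roughly as follows. This is only an illustration under stated assumptions: LogoInliner and its methods are hypothetical names, the image is embedded as a base64 data URI, and whether the converter renders data URIs has not been verified here (saving the downloaded bytes to a file under the base path and pointing src at that file is an alternative). Only standard .NET classes are used:

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class LogoInliner
{
    // Builds a data URI such as "data:image/jpeg;base64,..." from raw image bytes.
    public static string ToDataUri(byte[] imageBytes, string mimeType)
    {
        return "data:" + mimeType + ";base64," + Convert.ToBase64String(imageBytes);
    }

    // Replaces one exact src="..." attribute value in the HTML text.
    // Assumes the attribute is written with straight double quotes.
    public static string ReplaceImageSource(string html, string originalSrc, string newSrc)
    {
        return html.Replace("src=\"" + originalSrc + "\"", "src=\"" + newSrc + "\"");
    }

    // Downloads the image with a web request and embeds it in the HTML.
    public static async Task<string> InlineLogoAsync(string html, string logoUrl, string mimeType)
    {
        using (HttpClient client = new HttpClient())
        {
            byte[] bytes = await client.GetByteArrayAsync(logoUrl);
            return ReplaceImageSource(html, logoUrl, ToDataUri(bytes, mimeType));
        }
    }
}
```

The returned HTML can then be written back to a file in the base directory and loaded with the Document constructor exactly as in the earlier code.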

Hi Imran,


I have two questions about converting HTML to PDF/JPEG.

1. Can it load an external .js file referenced in the HTML? In my case, the .js file has no effect on the converted output.

2. Is there any way to tell whether the HTML has finished loading when converting HTML to PDF/JPEG?
If .js files are supported, I suspect the HTML may not be fully loaded at the moment the page is captured for conversion.

Thanks

Hi Imran,


This is a follow-up question:

Does Aspose use its own internal rendering engine, or a public rendering engine such as WebKit?

Thanks

Hi,


Thank you for the inquiry. Aspose.Pdf for .NET has its own rendering engine and does not use any public rendering engine. It can load external resources, including JavaScript files, and once an HTML file has been loaded into a Document instance, it has been loaded together with its external resources. Furthermore, you can customize how external resources are loaded. Please refer to this help topic: Convert HTML to PDF - Resource Loading Callback

Hi,


In my case, I notice that Aspose.Pdf doesn't execute the JavaScript included in the HTML file when converting HTML to PDF/JPEG. It seems the JavaScript is simply ignored during the conversion process.
Do you know anything about this?

Thanks

Hi,


Thank you for the details. We found a few tickets already logged about JavaScript not being executed when generating the output PDF documents. We plan to improve that area, so kindly send us your source HTML file that includes the JavaScript. We will investigate and share our findings with you. We look forward to your response.

Hi,


Thanks for your reply. I have attached a sample HTML file and the output image. You can see the differences:

1. The pagination bar, which should be controlled by JavaScript (thumbnail.js), is not rendered as expected, which shows that the JavaScript is not executed.
2. The Previous and Next buttons also do not render correctly.

Thanks,

Hi,


Thank you for posting source files. We managed to replicate the said problem in our environment. It has been logged under the ticket ID PDFNET-42909 in our bug tracking system. We have linked your post to this ticket and will keep you informed regarding any available updates. We are sorry for the inconvenience caused.