HTML page URL to PDF file

Hi Aspose Team,

I want to generate a PDF file from any given URL. How can I achieve this?
For example, if I supply http://yahoo.com/, Aspose should generate a PDF file for that URL.



Please provide a workable solution for converting an HTML page (via its URL) to PDF.

Thanks in Advance.

Hi There,


Thanks for contacting support.

Please use the following code snippet to convert a web page to PDF.

JAVA

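// Open a connection to the page to be converted
URL pageUrl = new URL("https://www.yahoo.com");
URLConnection urlConnection = pageUrl.openConnection();
// Load the HTML stream; the URL passed to HtmlLoadOptions is the base URL for the page's resources
Document doc = new Document(urlConnection.getInputStream(), new HtmlLoadOptions("https://www.yahoo.com"));
// Save the document as PDF (dataDir is the output folder)
doc.save(dataDir + "output.pdf");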


If you need further assistance, please feel free to contact us.

Best Regards,

Thank you very much for the quick response, but the URL-to-PDF conversion includes only inline CSS.
External CSS files referenced by the page are not applied during conversion.
Is there a workaround for HTML or URL-to-PDF conversion that includes external CSS?

Please provide a workable solution for converting an HTML page (via its URL) to PDF, including external CSS.

Thanks in Advance.

Hi There,


Thanks for your inquiry. The HtmlLoadOptions class is used to load HTML content, which can then be saved in PDF format. Instantiate an HtmlLoadOptions object and pass it the base path/URL, which serves as the location from which resources referenced by the HTML (such as images) are loaded during conversion. Please see the following code snippet for reference:

JAVA
// Specify the base path/URL for the HTML file, which serves as the image database
String basePath = "pdftest";
HtmlLoadOptions htmloptions = new HtmlLoadOptions(basePath);
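
To illustrate why the base path/URL matters, the sketch below uses only the JDK's java.net.URI to show how a relative reference in a page (for example, an external stylesheet) resolves against a base URL. It does not call Aspose and makes no claim about how the library resolves resources internally; the class name BaseUrlDemo and the example.com addresses are hypothetical.

```java
import java.net.URI;

class BaseUrlDemo {
    // Resolve an href found in a page against the page's base URL (hypothetical helper).
    static String resolve(String baseUrl, String href) {
        return URI.create(baseUrl).resolve(href).toString();
    }

    public static void main(String[] args) {
        String base = "https://www.example.com/articles/index.html";

        // Relative to the page's folder
        System.out.println(resolve(base, "css/site.css"));
        // prints https://www.example.com/articles/css/site.css

        // Relative to the site root
        System.out.println(resolve(base, "/img/logo.png"));
        // prints https://www.example.com/img/logo.png

        // Already absolute, so the base is ignored
        System.out.println(resolve(base, "https://cdn.example.com/a.css"));
        // prints https://cdn.example.com/a.css
    }
}
```

If the base is missing or wrong, the first two references have nothing valid to resolve against, which is one plausible reason external CSS or images go missing in a conversion.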


If you still face issues, please share the sample URL that you are trying to convert to PDF.

We are sorry for this inconvenience.

Best Regards,
Hello,
We found an issue while converting HTML files to PDF using Aspose.
HTML files that contain an embedded image do not show the image after conversion to PDF when the files are located (and addressed in the code) on a network path.
I reproduced the issue using the following code snippet:
// Specify the base path/URL for the HTML file, which serves as the image database
String basePath = @"\\DELLWRO\share2\00000001_00000001_Body.htm";
HtmlLoadOptions htmloptions = new HtmlLoadOptions(basePath);
// Load HTML file
Document doc = new Document(@"\\DELLWRO\share2\00000001_00000001_Body.htm", htmloptions);
// Save pdf file
doc.Save("output.pdf");
If the file at \\DELLWRO\share2\00000001_00000001_Body.htm contains an embedded image, the image is not shown in the output file.
Running the same code with ordinary local paths gives the correct result (the embedded image appears in the output):
// Specify the base path/URL for the HTML file, which serves as the image database
String basePath = @"E:\temp\share2\00000001_00000001_Body.htm";
HtmlLoadOptions htmloptions = new HtmlLoadOptions(basePath);
// Load HTML file
Document doc = new Document(@"E:\temp\share2\00000001_00000001_Body.htm", htmloptions);
// Save pdf file
doc.Save("output.pdf");

Hi There,


Thanks for sharing further details. I have tested the scenario and have not noticed any issue. Please make sure you are setting the license before using the above code snippet.

If you still face the issue, please share your project folder for further investigation.

We are sorry for this inconvenience.

Best Regards,

I tried this with aspose-pdf-18.8 and get the following error:

java.util.regex.PatternSyntaxException: Unknown look-behind group near index 20
(?[0-9A-Fa-f]{6})|(?[0-9A-Fa-f]{3})\s*
^
at java.util.regex.Pattern.error(Pattern.java:1713)
at java.util.regex.Pattern.group0(Pattern.java:2505)
at java.util.regex.Pattern.sequence(Pattern.java:1806)
at java.util.regex.Pattern.expr(Pattern.java:1752)
at java.util.regex.Pattern.compile(Pattern.java:1460)
at java.util.regex.Pattern.<init>(Pattern.java:1133)
at java.util.regex.Pattern.compile(Pattern.java:847)
at com.aspose.pdf.internal.l1671.I17.(Unknown Source)
at com.aspose.pdf.internal.l781.I4l.lif(Unknown Source)
at com.aspose.pdf.internal.l781.I4l.l01(Unknown Source)
at com.aspose.pdf.internal.l781.I4l.lIF(Unknown Source)
at com.aspose.pdf.internal.l781.I4l.lif(Unknown Source)
at com.aspose.pdf.internal.l781.I07.lif(Unknown Source)
at com.aspose.pdf.internal.l774.II.lif(Unknown Source)
at com.aspose.pdf.internal.l774.II.lif(Unknown Source)
at com.aspose.pdf.internal.l774.I7.lif(Unknown Source)
at com.aspose.pdf.internal.l781.I07.lif(Unknown Source)
at com.aspose.pdf.internal.l83l.Il.lif(Unknown Source)
at com.aspose.pdf.internal.l83l.Il.lif(Unknown Source)
at com.aspose.pdf.internal.l1117.Il.lif(Unknown Source)
at com.aspose.pdf.internal.l1067.I7.lif(Unknown Source)
at com.aspose.pdf.internal.l1067.I7.lif(Unknown Source)
at com.aspose.pdf.internal.l1067.I7.lif(Unknown Source)
at com.aspose.pdf.internal.l1067.I7.lif(Unknown Source)
at com.aspose.pdf.internal.l1067.I7.lif(Unknown Source)
at com.aspose.pdf.internal.l1067.I7.lif(Unknown Source)
at com.aspose.pdf.internal.l1067.I7.lif(Unknown Source)
at com.aspose.pdf.internal.l1067.I7.lif(Unknown Source)
at com.aspose.pdf.internal.l1067.I7.lif(Unknown Source)
at com.aspose.pdf.internal.l1067.I7.lif(Unknown Source)
at com.aspose.pdf.internal.l1067.I7.lif(Unknown Source)
at com.aspose.pdf.internal.l1067.I7.lif(Unknown Source)
at com.aspose.pdf.internal.l1034.I84.lif(Unknown Source)
at com.aspose.pdf.I161.lif(Unknown Source)
at com.aspose.pdf.I161.lif(Unknown Source)
at com.aspose.pdf.ADocument.lif(Unknown Source)
at com.aspose.pdf.ADocument.<init>(Unknown Source)
at com.aspose.pdf.Document.<init>(Unknown Source)
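
A possible reading of the trace above, offered as an assumption rather than a confirmed diagnosis: java.util.regex reports "Unknown look-behind group" when a group opens with `(?<` and the next character is neither `=` nor `!` nor, on Java 7 and later, a letter that starts a group name. Java 6 has no named-group support, so a pattern of the form `(?<name>...)` fails there with this message. The pattern printed in the trace seems to have lost some characters, so the original group names are unknown. The small probe below checks whether the running JVM accepts named groups; the class name NamedGroupProbe and the group names six and three are hypothetical.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class NamedGroupProbe {
    // Hypothetical stand-in for the failing pattern; the real group names are not known.
    static final String HEX_COLOR = "(?<six>[0-9A-Fa-f]{6})|(?<three>[0-9A-Fa-f]{3})\\s*";

    // Returns true if this JVM can compile a pattern that uses named capturing groups.
    static boolean supportsNamedGroups() {
        try {
            Pattern.compile(HEX_COLOR);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("named groups supported = " + supportsNamedGroups());
    }
}
```

If the probe reports false on the machine where the conversion fails, the Java runtime version would be worth mentioning when reporting the problem.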

@jzp

Thank you for contacting support.

Would you please share a narrowed-down code snippet that reproduces this issue, along with the source file, if any, so that we can try to reproduce and investigate it in our environment?