Example WebPageToPDF has error

Aspose.PDF for .NET
I downloaded Aspose.Pdf.Examples.CSharp.sln.
I get an error right out of the box in
Apose\Examples\CSharp\AsposePDF\DocumentConversion\WebPageToPDF.cs
(and also same error in ProvideCredentialsDuringHTMLToPDF.cs)

namespace Aspose.Pdf.Examples.CSharp.AsposePDF.DocumentConversion
{
  public class WebPageToPDF
  {
    public static void Run()
    {
      try
      {
        string dataDir = RunExamples.GetDataDir_AsposePdf_DocumentConversion();
        WebRequest request = WebRequest.Create("https://en.wikipedia.org/wiki/Main_Page");
        request.Credentials = CredentialCache.DefaultCredentials;
        HttpWebResponse response = (HttpWebResponse)request.GetResponse();

This last line gives this error -
“The underlying connection was closed: An unexpected error occurred on a receive.”

Aspose.PDF version is 20.3.0.0
I am using Visual Studio 2019.

@hshlom

Please make sure you are using version 24.3 of the API, and let us know if you still see the issue.
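
Independently of the library version, this message often points to a TLS handshake problem rather than to Aspose.PDF itself: on older .NET Framework targets TLS 1.2 may not be enabled by default, and many servers now reject older protocol versions. The thread does not confirm that this was the cause here, so the following is only a minimal sketch of one thing to rule out. The class and method names (`TlsSetup`, `EnableTls12`, `Fetch`) are hypothetical.

```csharp
using System;
using System.IO;
using System.Net;

public static class TlsSetup
{
    // Adds TLS 1.2 to the protocols used for outgoing HTTPS requests.
    // Assumption: the app targets a .NET Framework version where it is off by default.
    public static void EnableTls12()
    {
        ServicePointManager.SecurityProtocol |= SecurityProtocolType.Tls12;
    }

    // Downloads a page as text; call EnableTls12() once before using this.
    public static string Fetch(string url)
    {
        WebRequest request = WebRequest.Create(url);
        request.Credentials = CredentialCache.DefaultCredentials;
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Calling `TlsSetup.EnableTls12()` once at startup, before the first `GetResponse()`, is enough; if the error stays the same afterwards, the cause is probably elsewhere (proxy, firewall, or the server itself).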

Thank you very much for your reply.

Every 24.3 download link on the page "Download .NET Component DLL to Process PDF | Aspose.PDF API"
gives a 404 Page Not Found,
on both Chrome and Edge.

I was able to download 24.2 (dlls only), but get the same error.

@hshlom

We are looking into the problem with downloading the DLLs from our site. Meanwhile, please try the code snippet below; it produced the attached PDF in our environment without any exception:

var url = "https://en.wikipedia.org/wiki/Main_Page";

WebClient client = new WebClient();
string html = client.DownloadString(url);

System.Net.WebRequest request = System.Net.WebRequest.Create(url);
// If required by the server, set the credentials.
request.Credentials = System.Net.CredentialCache.DefaultCredentials;
// Timeout in milliseconds before the request is aborted
request.Timeout = 1000;
// Get the response.
System.Net.HttpWebResponse response = (System.Net.HttpWebResponse)request.GetResponse();
// Get the stream containing content returned by the server.
Stream dataStream = response.GetResponseStream();
// Open the stream using a StreamReader for easy access.
StreamReader reader = new StreamReader(dataStream);
// Read the content.
string responseFromServer = reader.ReadToEnd();
reader.Close();
dataStream.Close();
response.Close();
      
MemoryStream stream = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(responseFromServer));
HtmlLoadOptions options = new HtmlLoadOptions(url);
options.PageInfo.IsLandscape = false;
options.PageInfo.Margin = new MarginInfo() { Bottom = 0, Left = 0, Right = 0, Top = 0 };
options.PageInfo.Height = Aspose.Pdf.PageSize.PageLetter.Height;
options.PageInfo.Width = Aspose.Pdf.PageSize.PageLetter.Width;
// Load HTML file
Document pdfDocument = new Document(stream, options);
// Save output as PDF format
pdfDocument.Save(dataDir + "WebpageToPdf.pdf");

WebpageToPdf.pdf (1.0 MB)

Thank you.
That works, except that some HTML overlaps other HTML.
I hope I can get around this.
But I have one other related question.

Is there a way to save the file in the user’s Downloads folder?
I can’t hard-code that path as a string, since it sits under C:\Users\UserAccountName and the account name differs per user.
I haven’t seen this in the examples.

@hshlom

You can use the code snippet below for that purpose:

Aspose.Pdf.Document pdfdoc = new Aspose.Pdf.Document();
MemoryStream stream = new MemoryStream();
doc.Save(stream, Word.SaveFormat.Pdf);
pdfdoc.Save(stream);
Response.Clear();
Response.ClearHeaders();
Response.ClearContent();
Response.Charset = "UTF-8";
Response.AddHeader("content-length", stream.Length.ToString());
Response.AddHeader("content-disposition", "attachment;filename=TestDocument.pdf"); // OR use "inline;filename=TestDocument.pdf"
Response.ContentType = "application/pdf";
Response.BinaryWrite(stream.ToArray());
Response.Flush();
Response.End();
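
The snippet above suits an ASP.NET page, where the browser, not the server, decides where the download lands (typically the user's Downloads folder). For a desktop or console program, a minimal sketch is to build the path from the current user's profile folder instead of hard-coding the account name. It assumes Downloads is at its default location under the profile (users can relocate it); `DownloadsPath.For` is a hypothetical helper, not part of Aspose.PDF.

```csharp
using System;
using System.IO;

public static class DownloadsPath
{
    // Returns <user profile>\Downloads\<fileName> for the current user.
    // Assumption: the Downloads folder has not been moved from its default location.
    public static string For(string fileName)
    {
        string profile = Environment.GetFolderPath(Environment.SpecialFolder.UserProfile);
        return Path.Combine(profile, "Downloads", fileName);
    }
}
```

The result can then be passed to the file-path overload of Save, for example `pdfDocument.Save(DownloadsPath.For("WebpageToPdf.pdf"));`.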

Thank you once again for your prompt replies.
I appreciate your time.
However, when I try to use this code, I get two different errors:
"Cannot access a closed Stream" and "Stream was not writable".

Converted to VB.Net, I tried two different techniques, defining stream As MemoryStream
with and without “Using”.
I show the errors on the lines where I get them.

Public Shared Sub ConvertUrlToPdf(ByVal url As String, ByVal fileName As String)

    Dim client As System.Net.WebClient = New System.Net.WebClient()
    Dim html As String = client.DownloadString(url)
    Dim request As Net.WebRequest = Net.WebRequest.Create(url)
    request.Credentials = Net.CredentialCache.DefaultCredentials
    request.Timeout = 100000

    Dim resp As Net.HttpWebResponse = CType(request.GetResponse(), Net.HttpWebResponse)
    Dim dataStream As Stream = resp.GetResponseStream()
    Dim reader As StreamReader = New StreamReader(dataStream)
    Dim responseFromServer As String = reader.ReadToEnd()
    reader.Close()
    dataStream.Close()
    resp.Close()

    Dim options As Aspose.Pdf.HtmlLoadOptions = New Aspose.Pdf.HtmlLoadOptions(url)
    options.PageInfo.IsLandscape = False
    options.PageInfo.Margin = New Aspose.Pdf.MarginInfo() With {
        .Bottom = 0,
        .Left = 0,
        .Right = 0,
        .Top = 0
    }
    options.PageInfo.Height = Aspose.Pdf.PageSize.PageLetter.Height
    options.PageInfo.Width = Aspose.Pdf.PageSize.PageLetter.Width

    ' Technique #1
    Dim stream As MemoryStream = New MemoryStream(Encoding.UTF8.GetBytes(responseFromServer))
    Dim pdfDoc = New Aspose.Pdf.Document(stream, options)
    Dim streamLength As String = stream.Length.ToString() ' --> Cannot access a closed Stream.
    pdfDoc.Save(stream, Aspose.Pdf.SaveFormat.Pdf) ' --> Stream was not writable.

    ' Technique #2
    Using stream As New MemoryStream(Encoding.UTF8.GetBytes(responseFromServer))
        Dim pdfDoc = New Aspose.Pdf.Document(stream, options)
        pdfDoc.Save(stream, Aspose.Pdf.SaveFormat.Pdf) ' --> Stream was not writable.

        Dim httResponse As System.Web.HttpResponse = System.Web.HttpContext.Current.Response
        httResponse.Clear()
        httResponse.ClearHeaders()
        httResponse.ClearContent()
        httResponse.Charset = "UTF-8"
        httResponse.AddHeader("content-length", stream.Length.ToString())   'Cannot access a closed Stream.
        httResponse.AddHeader("Content-Disposition", ("attachment; filename=" & (fileName & ".pdf")))
        httResponse.ContentType = "application/pdf"
        httResponse.BinaryWrite(stream.ToArray())
        httResponse.Flush()
        httResponse.End()
    End Using

End Sub

@hshlom

You only need to write the PDF document to a newly created MemoryStream object, as we suggested in the code we shared above. Looking at your code, in Technique 2 you are trying to save the PDF document to an existing stream, and in Technique 1 you are not saving the PDF document anywhere.
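
Both error messages can be reproduced with a plain MemoryStream, without Aspose.PDF, which may help show why a separate output stream is needed. This is a minimal sketch that assumes the input stream ends up closed once the Document has consumed it; the error on `stream.Length` suggests this, but the thread does not confirm it. `StreamErrorsDemo` and its methods are hypothetical names.

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamErrorsDemo
{
    // Reading Length from a closed MemoryStream throws ObjectDisposedException,
    // whose message is "Cannot access a closed Stream."
    public static bool LengthThrowsAfterClose()
    {
        var input = new MemoryStream(Encoding.UTF8.GetBytes("<html></html>"));
        input.Close();
        try
        {
            long unused = input.Length;
            return false;
        }
        catch (ObjectDisposedException)
        {
            return true;
        }
    }

    // A closed stream also reports CanWrite == false, so writers refuse it.
    public static bool CanWriteAfterClose()
    {
        var input = new MemoryStream(Encoding.UTF8.GetBytes("<html></html>"));
        input.Close();
        return input.CanWrite;
    }

    // A fresh, empty MemoryStream is writable and grows as needed.
    public static long WriteToFreshStream(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            output.Write(data, 0, data.Length);
            return output.Length;
        }
    }
}
```

In other words, the stream that carries the HTML in should be treated as read-only input, and the PDF should be saved into a second, newly created stream.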

Hello again.
You have some typos in the code you are referring to, so it was hard to use. In particular -
MemoryStream stream = new MemoryStream();
doc.Save(stream, Word.SaveFormat.Pdf);
pdfdoc.Save(stream);

The following code saves a PDF file that contains JavaScript and HTML text, possibly with some PDF syntax at the top.
But the PDF file does not open; it fails with the error -
This file cannot be opened because it has no pages.
Please help.
Thank you.

Dim client As System.Net.WebClient = New System.Net.WebClient()
Dim html As String = client.DownloadString(url)
Dim request As Net.WebRequest = Net.WebRequest.Create(url)
request.Credentials = Net.CredentialCache.DefaultCredentials
request.Timeout = 100000

Dim resp As Net.HttpWebResponse = CType(request.GetResponse(), Net.HttpWebResponse)
Dim dataStream As Stream = resp.GetResponseStream()
Dim reader As StreamReader = New StreamReader(dataStream)
Dim responseFromServer As String = reader.ReadToEnd()
reader.Close()
dataStream.Close()
resp.Close()

Dim stream As MemoryStream = New MemoryStream(Encoding.UTF8.GetBytes(responseFromServer))
Dim pdfDoc = New Aspose.Pdf.Document()
pdfDoc.Save(stream, Aspose.Pdf.SaveFormat.Pdf)

Dim httResponse As System.Web.HttpResponse = System.Web.HttpContext.Current.Response
httResponse.Clear()
httResponse.AddHeader("Content-Type", "application/pdf")
httResponse.AddHeader("Content-Disposition", ("attachment; filename=" & (fileName & ".pdf" & ("; size=" & stream.Length.ToString))))
httResponse.ContentType = "application/pdf"
httResponse.BinaryWrite(stream.ToArray())
httResponse.Flush()
httResponse.End()

@hshlom

The problem appears to be in the part of your code that creates and saves the document. You need to initialize the Document with the HTML stream and HtmlLoadOptions, as below:

Dim pdfDoc = New Aspose.Pdf.Document(stream, New HtmlLoadOptions())

Then you need to save the initialized Document into another stream (the output stream), as below:

Dim outstream As MemoryStream = New MemoryStream()
pdfDoc.Save(outstream, Aspose.Pdf.SaveFormat.Pdf)

Thank you. That was what I needed, the PDF process now works!
