HTML to PDF Issues

Hi, we need to convert some HTML files into PDFs and add headers and footers to every PDF page, but we have encountered a few problems.

There seem to be some issues with Aspose's HTML to PDF conversion. We first used Aspose version 9.9, but the same issues appeared even after updating to 10.1.0.

1) We used the Aspose.Pdf.Document class with the code example from http://www.aspose.com/docs/display/pdfnet/Convert+HTML+to+PDF+Format to load a stream into the PDF, but the third line of the code below gives the error "Value cannot be null. Parameter name: path1".

HtmlLoadOptions options = new HtmlLoadOptions(basePath);
MemoryStream stream1 = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(content.Content));
Document pdfDocument = new Document(stream1, options);

My question is: how can we load an HTML string without getting this error? Can you explain why the error appears?

2) The second approach was to use the Aspose.Pdf.Generator.Pdf class, but then the text has no formatting: all text appears at the same size and is too big for the page.

Aspose.Pdf.Generator.Pdf pdf = new Pdf();
// Specify the character encoding for the HTML file
pdf.HtmlInfo.CharSet = "UTF-8";
pdf.HtmlInfo.CharsetApplyingLevelOfForce = HtmlInfo.CharsetApplyingForceLevel.UseWhenImpossibleDetectFromContent;
pdf.BindHTML(stream, basePath);
using (MemoryStream ms = new MemoryStream())
{
    pdf.Save(ms);
    return ms.ToArray();
}

My guess is that the CSS is not loading, but I'm not sure. Do you have any insights that might help us with this?

3) I've also read that CSS float is not supported. We used it in the footer, where some text needs to be aligned to the left and some to the right. Is there another way to accomplish this with Aspose?


Could you help us with these issues?

Thanks in advance

mihai.runcan:
1) We used the Aspose.Pdf.Document class with the code example from http://www.aspose.com/docs/display/pdfnet/Convert+HTML+to+PDF+Format to load a stream into the PDF, but the third line of the code below gives the error "Value cannot be null. Parameter name: path1".

HtmlLoadOptions options = new HtmlLoadOptions(basePath);
MemoryStream stream1 = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(content.Content));
Document pdfDocument = new Document(stream1, options);

My question is: how can we load an HTML string without getting this error? Can you explain why the error appears?

Hi Mihai,


Thanks for contacting support.


I tested the scenario using Aspose.Pdf for .NET with the following code snippet and could not reproduce the issue; the HTML file is converted to PDF properly. Could you please share the HTML resource you are using so that we can test the conversion in our environment?

[C#]

// Read the contents of the HTML file into a StreamReader object
StreamReader r = File.OpenText(@"c:/pdftest/Untitled1_Filled.html");
HtmlLoadOptions options = new HtmlLoadOptions("c:/pdftest/");
MemoryStream stream1 = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(r.ReadToEnd()));
Document pdfDocument = new Document(stream1, options);
Console.WriteLine(pdfDocument.Pages.Count);
pdfDocument.Save("c:/pdftest/FileConverted.pdf");

mihai.runcan:
2) The second approach was to use the Aspose.Pdf.Generator.Pdf class, but then the text has no formatting: all text appears at the same size and is too big for the page.

Aspose.Pdf.Generator.Pdf pdf = new Pdf();
// Specify the character encoding for the HTML file
pdf.HtmlInfo.CharSet = "UTF-8";
pdf.HtmlInfo.CharsetApplyingLevelOfForce = HtmlInfo.CharsetApplyingForceLevel.UseWhenImpossibleDetectFromContent;
pdf.BindHTML(stream, basePath);
using (MemoryStream ms = new MemoryStream())
{
    pdf.Save(ms);
    return ms.ToArray();
}

My guess is that the CSS is not loading, but I'm not sure. Do you have any insights that might help us with this?

Aspose.Pdf.Generator is a legacy approach, and we recommend using the latest Document Object Model of the Aspose.Pdf namespace. Please also note that all enhancements and bug fixes are being made in this model.

mihai.runcan:
3) I've also read that CSS float is not supported. We used it in the footer, where some text needs to be aligned to the left and some to the right. Is there another way to accomplish this with Aspose?

Could you please share some resource files that would help us replicate the issue in our environment? We are sorry for the inconvenience.

Thanks for your answer. I can't share the HTML document. Anyway, I've looked further into these issues and found some others.

1) It seems that the path1 error comes from adding the following CSS to the page, in <style> tags.

@font-face {
    font-family: 'entypo';
    src: url('entypo.eot');
    src: url('entypo.eot?#iefix') format('embedded-opentype'),
         url('entypo.woff') format('woff'),
         url('entypo.ttf') format('truetype'),
         url('entypo.svg#entypo') format('svg');
    font-weight: normal;
    font-style: normal;
}

Am I missing something that is needed to load those fonts? In any case, you should give developers a more specific error.


2) Another issue is that creating the document takes about 1 minute.

I read the content of the HTML into a string, in the variable content.Content.
The string contains ~4000 lines, of which about 3000 are embedded CSS, and it also contains some embedded images, for example
background: url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAAGXRFWHRTb2Z0d2FyZ to give you an idea.
Is that time expected, and if so, can you suggest some ways to reduce it?

Below is the code to create the document from the string.
HtmlLoadOptions htmloptions = new HtmlLoadOptions(basePath1);
MemoryStream stream1 = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(content.Content));
Document doc = new Document(stream1, htmloptions);

3) Regarding the footer, the HTML section looks like this:

Name of the application

Report created on: 05/03/2015     ($p of $P)

($p of $P) is the current page number out of the total number of pages.
Could you help me put this in the footer of the pages?

Thanks,

mihai.runcan:
Thanks for your answer. I can't share the HTML document. Anyway, I've looked further into these issues and found some others.

1) It seems that the path1 error comes from adding the following CSS to the page, in <style> tags.

@font-face {
font-family: 'entypo';
src: url('entypo.eot');
src: url('entypo.eot?#iefix') format('embedded-opentype'),
url('entypo.woff') format('woff'),
url('entypo.ttf') format('truetype'),
url('entypo.svg#entypo') format('svg');
font-weight: normal;
font-style: normal;
}

Am I missing something that is needed to load those fonts? In any case, you should give developers a more specific error.

Hi Mihai,

Thanks for sharing the details.

I tested the scenario and was able to reproduce the problem. I have logged it as PDFNEWNET-38331 in our issue tracking system. We will look further into the details and keep you updated on the status of the fix. Please be patient and allow us a little time. We are sorry for the inconvenience.

[C#]

// Read the contents of the HTML file into a StreamReader object
StreamReader r = File.OpenText(@"c:/pdftest/51x.html");
HtmlLoadOptions options = new HtmlLoadOptions("c:/pdftest/");
options.PageInfo.Height = 600;
options.PageInfo.Width = 400;
MemoryStream stream1 = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(r.ReadToEnd()));
Document pdfDocument = new Document(stream1, options);
pdfDocument.Save("c:/pdftest/51x.pdf");
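Until a fix is available, one possible workaround is to remove the @font-face rule from the HTML string before passing it to Document, since the error reportedly appears only when that rule is present. The following is a minimal sketch that has not been tested against Aspose.Pdf. FontFaceStripper and Strip are hypothetical names, and the sketch assumes that the icon font is not essential to the output and that the rule body contains no nested braces.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Aspose.Pdf.
public static class FontFaceStripper
{
    // Matches "@font-face { ... }" up to the first closing brace.
    // [^}] also matches line breaks, so multi-line rules are covered.
    private static readonly Regex FontFaceRule =
        new Regex(@"@font-face\s*\{[^}]*\}", RegexOptions.IgnoreCase);

    // Returns the input with every @font-face rule removed.
    public static string Strip(string html)
    {
        if (html == null) throw new ArgumentNullException(nameof(html));
        return FontFaceRule.Replace(html, string.Empty);
    }
}
```

With this in place, the MemoryStream would be built from FontFaceStripper.Strip(content.Content) instead of content.Content. Text that relied on the entypo font would then fall back to another font.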

mihai.runcan:
2) Another issue is that creating the document takes about 1 minute.

I read the content of the HTML into a string, in the variable content.Content.
The string contains ~4000 lines, of which about 3000 are embedded CSS, and it also contains some embedded images, for example
background: url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAAAGXRFWHRTb2Z0d2FyZ to give you an idea.
Is that time expected, and if so, can you suggest some ways to reduce it?

Below is the code to create the document from the string.
HtmlLoadOptions htmloptions = new HtmlLoadOptions(basePath1);
MemoryStream stream1 = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(content.Content));
Document doc = new Document(stream1, htmloptions);

Hi Mihai,

The time the API takes to perform the HTML to PDF conversion depends on the contents being transformed into the PDF document. So that we can test the scenario, please share the input HTML and we will run the conversion in our environment.

mihai.runcan:
3) Regarding the footer, the HTML section looks like this:

Name of the application

Report created on: 05/03/2015     ($p of $P)

($p of $P) is the current page number out of the total number of pages.
Could you help me put this in the footer of the pages?

$p and $P are replaceable symbols, but they only work with the Aspose.Pdf.Generator.Text object. If you need to place the page number and page count in the document footer, remove the tag containing the page numbering information and try using a PageNumberStamp instance instead.

[C#]

// Read the contents of the HTML file into a StreamReader object
StreamReader r = File.OpenText(@"c:/pdftest/51x.html");
HtmlLoadOptions options = new HtmlLoadOptions("c:/pdftest/");
MemoryStream stream1 = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(r.ReadToEnd()));
Document pdfDocument = new Document(stream1, options);
Console.WriteLine(pdfDocument.Pages.Count);
foreach (Aspose.Pdf.Page current_page in pdfDocument.Pages)
{
    // Create a page number stamp
    PageNumberStamp pageNumberStamp = new PageNumberStamp();
    pageNumberStamp.Format = "Report created on: 05/03/2015 (" + current_page.Number + " of " + pdfDocument.Pages.Count + ")";
    pageNumberStamp.BottomMargin = 10;
    pageNumberStamp.HorizontalAlignment = Aspose.Pdf.HorizontalAlignment.Center;
    pageNumberStamp.StartingNumber = 1;
    // Add the stamp to this particular page
    pdfDocument.Pages[current_page.Number].AddStamp(pageNumberStamp);
}
pdfDocument.Save("c:/pdftest/51x.pdf");
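For the left/right footer layout asked about in point 3, a possible alternative to CSS float is to add two TextStamp objects per page, one aligned left and one aligned right. This is only a sketch under assumptions: it requires a reference to the Aspose.Pdf assembly, the margin values are illustrative, the class name FooterStamps and the output file name are hypothetical, and it operates on the already converted PDF from the snippet above.

```csharp
using Aspose.Pdf;

// Hypothetical example class; requires a reference to Aspose.Pdf.
public static class FooterStamps
{
    public static void Main()
    {
        // Input path taken from the snippet above; adjust as needed.
        Document pdfDocument = new Document("c:/pdftest/51x.pdf");
        int pageCount = pdfDocument.Pages.Count;

        foreach (Page page in pdfDocument.Pages)
        {
            // Left-aligned footer text
            TextStamp left = new TextStamp("Name of the application");
            left.HorizontalAlignment = HorizontalAlignment.Left;
            left.VerticalAlignment = VerticalAlignment.Bottom;
            left.LeftMargin = 20;   // illustrative value
            left.BottomMargin = 10; // illustrative value

            // Right-aligned footer text with "current of total" page numbers
            TextStamp right = new TextStamp(
                "Report created on: 05/03/2015 (" + page.Number + " of " + pageCount + ")");
            right.HorizontalAlignment = HorizontalAlignment.Right;
            right.VerticalAlignment = VerticalAlignment.Bottom;
            right.RightMargin = 20;  // illustrative value
            right.BottomMargin = 10; // illustrative value

            page.AddStamp(left);
            page.AddStamp(right);
        }

        // Hypothetical output file name
        pdfDocument.Save("c:/pdftest/51x_footer.pdf");
    }
}
```

Creating new stamp instances for each page keeps the page-specific text simple; the footer markup would be removed from the source HTML so that it does not appear twice.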

The issues you have found earlier (filed as PDFNET-38331) have been fixed in Aspose.PDF for .NET 20.3.