Free Support Forum - aspose.com

Data-dependent conversion error: HTML -> PDF

Dear Sirs,

We are now testing our prototype, which uses the Aspose PDF library, with realistic data from our customer.

Unfortunately, the HTML blocks that have to be rendered in the generated PDF document may be badly formatted, as in the example "ugly-html-renamed.zip". Since the Aspose library seems to validate the HTML strictly, the conversion can crash on disallowed formats, tags, or characters.

Is there a way to ignore formatting instructions or characters that can't be correctly interpreted or rendered during the conversion, so that they don't cause a crash, or to check and clean up the block before conversion?

The problem is that we have no influence on the quality of the raw data containing the HTML blocks, since they can be copied from a third-party tool such as Word into our database.

So the HTML tags should only be rendered where possible.

The example shows that the HTML can contain any kind of strange formatting, including inline CSS.

Any ideas? Thanks for any hint.

Kagel

Hi Kagel,

Thank you for sharing the details.

I analyzed your file, and the issue is that the entity &quot; is used instead of a closing quote wherever the font is declared. You can fix this by replacing each &quot; with a closing quote ("). Please see the following sample code:

// Instantiate a Pdf object
Aspose.Pdf.Generator.Pdf pdf = new Aspose.Pdf.Generator.Pdf();

// Add a section to the PDF document's sections collection
Aspose.Pdf.Generator.Section section = pdf.Sections.Add();

// Read the contents of the HTML file (requires "using System.IO;")
string html;
using (StreamReader r = File.OpenText(@"D:\AP Data\March2012\ugly-html-renamed\ugly-html-renamed.html"))
{
    html = r.ReadToEnd();
}

// Create a text paragraph from the HTML, replacing each &quot; entity with a literal quote
Aspose.Pdf.Generator.Text text2 = new Aspose.Pdf.Generator.Text(section, html.Replace("&quot;", "\""));

// Enable this property so the HTML content is rendered with its own formatting
text2.IsHtmlTagSupported = true;

// Add the text paragraph containing the HTML to the section
section.Paragraphs.Add(text2);

// Save the PDF document
pdf.Save(@"D:\AP Data\March2012\ugly-html-renamed\ugly-html-renamed.pdf");
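
If replacing every &quot; entity in the whole file is broader than you want, the replacement can be limited to the markup itself. The following is only a minimal sketch under that assumption: HtmlCleanup and FixQuotesInTags are hypothetical names, not part of the Aspose API, and the simple tag pattern will not cope with attribute values that contain a ">" character.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Aspose: cleans an HTML block before conversion.
public static class HtmlCleanup
{
    // Matches a single tag, from '<' up to the next '>'.
    // Assumption: attribute values never contain '>'.
    private static readonly Regex Tag = new Regex("<[^>]*>");

    // Replaces the &quot; entity with a literal quote, but only inside tags,
    // so that entities in the visible text are left untouched.
    public static string FixQuotesInTags(string html)
    {
        return Tag.Replace(html, m => m.Value.Replace("&quot;", "\""));
    }
}
```

The cleaned string could then be passed to the Text constructor in place of the Replace call shown above, for example HtmlCleanup.FixQuotesInTags(html).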

Please try this and do let us know if this fixes your issue.

Sorry for the inconvenience,