HTML to PDF Conversion - not working with DOM method

rpkelley · October 2, 2015, 11:39am

Hi there,

We have recently refactored our code to generate PDFs from HTML using the preferred method found here in the forums:

    Aspose.Pdf.Document pdoc = new Aspose.Pdf.Document(new MemoryStream(data), new HtmlLoadOptions()
    {
        PageInfo = new PageInfo()
        {
            Margin = new Aspose.Pdf.MarginInfo(0, 0, 0, 0)
        },
        WarningHandler = new IgnoreHtmlWarningHandler(),
        InputEncoding = "UTF-8"
    });
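
The IgnoreHtmlWarningHandler class used above is not shown in this post. As a minimal sketch only, assuming it implements Aspose.Pdf's IWarningCallback interface and simply tells the loader to continue, it might look roughly like this (the class body and the logging call are illustrative assumptions, not our production code):

```csharp
using Aspose.Pdf;

// Hypothetical reconstruction: the class body is an assumption, not taken from the post.
public class IgnoreHtmlWarningHandler : IWarningCallback
{
    // Invoked by Aspose.Pdf for each warning raised while the document is loaded.
    public ReturnAction Warning(WarningInfo warning)
    {
        // Log the message so the offending markup can be traced later.
        System.Console.WriteLine(warning.WarningMessage);

        // Ask the loader to keep going rather than abort.
        return ReturnAction.Continue;
    }
}
```

Returning ReturnAction.Continue only covers problems that the library reports as warnings; it would not be expected to help when the HTML parser itself stops on malformed input.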

However, we receive bad HTML into our system as part of content that is embedded in XML documents, and we do not get to control it. This method works much better than the older PDF generator method, but it stops processing when it encounters bad HTML.

Is there a workaround, or an option that can be passed to a Document, so that it ignores bad HTML, similar to the HtmlInfo class?

Also, I have attached a zip file containing a dump of the bad HTML, but as I said, we do not generate or control what is loaded into the system.

codewarior · October 5, 2015, 12:44am

Hi Ryan,

Thanks for using our APIs.

Currently, Aspose.Pdf for .NET lets you detect and ignore a corrupted PDF file. However, it does not support detecting a corrupted HTML document when converting it to PDF format. I have logged this requirement as PDFNEWNET-39474 in our issue tracking system. We will look further into the details of this requirement and keep you posted on its status. Please be patient and allow us a little time. We are sorry for the inconvenience.

aspose.notifier · February 7, 2019, 4:46pm

The issues you have found earlier (filed as ) have been fixed in this update. This message was posted using BugNotificationTool from Downloads module by MuzammilKhan