HTML to PDF conversion hangs

We have some real-world emails that cause our system to hang when we attempt to open them using Aspose.Pdf. We have reduced the issue to a very small test case, which will hopefully help.


We realise the HTML in question is horribly invalid, but we have seen a few emails in the wild that exhibit this pattern. If the HTML itself cannot be supported, we would much prefer the library to fail fast rather than hang, which effectively causes a denial of service for our service.

// 'html' holds the contents of the attached file.
using (var htmlms = new MemoryStream(Encoding.UTF8.GetBytes(html)))
{
    var htmlOptions = new Aspose.Pdf.HtmlLoadOptions();
    Console.WriteLine("Opening...");
    // The hang occurs inside this constructor.
    using (var pdf = new Aspose.Pdf.Document(htmlms, htmlOptions))
    {
        Console.WriteLine("Open...");
        Console.WriteLine("Done");
    }
}

where the variable 'html' is the HTML in the attached file. You will notice that reaching "Open..." takes a very long time (if it ever finishes; we're not 100% sure).

Thanks,
Martyn

Hi Martyn,

Thanks for your inquiry. I have tested your scenario with the shared HTML using Aspose.Pdf for .NET 10.9.0 and was able to reproduce the reported issue. For further investigation, I have logged it in our issue tracking system as PDFNEWNET-39637 and linked your request to it. We will keep you updated on the issue status via this thread.

We are sorry for the inconvenience caused.

Best Regards,

The issues you have found earlier (filed as PDFNET-39637) have been fixed in Aspose.PDF for .NET 19.6.
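
For anyone stuck on a version before the fix, one possible way to get the "fail fast" behaviour asked for above is to bound how long the caller waits for the conversion. The following is only a minimal sketch using standard .NET tasks, not an Aspose.Pdf feature: `ConversionGuard` and `TryRun` are hypothetical names introduced here for illustration, and the timeout value is up to you.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of Aspose.Pdf.
static class ConversionGuard
{
    // Runs 'work' on a thread-pool thread and waits at most 'timeout' for it.
    // Returns true if the work finished in time, false if the wait timed out.
    // If the work throws before the timeout, Wait rethrows it wrapped in an
    // AggregateException.
    public static bool TryRun(Action work, TimeSpan timeout)
    {
        Task task = Task.Run(work);

        // Caveat: on timeout the work is NOT cancelled. The background
        // thread keeps running (and using CPU) until it finishes on its own.
        return task.Wait(timeout);
    }
}
```

The body of the delegate would be the `new Aspose.Pdf.Document(htmlms, htmlOptions)` block from the first post, for example `ConversionGuard.TryRun(() => { /* open and convert */ }, TimeSpan.FromSeconds(30))`, with a `false` result treated as a failed conversion. Because a hung conversion is abandoned rather than stopped, this only unblocks the caller. If hung conversions must actually be terminated, running the conversion in a separate worker process that can be killed is likely the safer design.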