Hello,
I was trying to convert some HTML to PDF, but the process hangs indefinitely during the document load phase (in the Document constructor).
This code reproduces the issue with the 21.12.0.0 NuGet package (.NET Framework 4.8):
using (var htmlStream = new System.IO.MemoryStream(System.Text.Encoding.UTF8.GetBytes(@"
<html>
<head></head>
<body>
<div id=""jstext""></div>
<script type=""text/javascript"">
document.getElementById('jstext').innerHTML = '>JS TEXT<';
</script>
</body>
</html>
")))
{
new Aspose.Pdf.Document(htmlStream, new Aspose.Pdf.HtmlLoadOptions());
}
If I HTML-encode the string, like this: '&gt;JS TEXT&lt;', it works correctly.
To work around the issue, I had to wrap the call in Task.Factory.StartNew and impose a timeout on the Wait call.
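A minimal sketch of that kind of workaround, assuming a hypothetical helper named TimeoutGuard (not part of Aspose) and an arbitrary 10-second limit in the usage comment:

```csharp
using System;
using System.Threading.Tasks;

static class TimeoutGuard
{
    // Hypothetical helper: runs 'work' on a separate thread and stops
    // waiting for it once 'timeout' has elapsed.
    public static T Run<T>(Func<T> work, TimeSpan timeout)
    {
        var task = Task.Factory.StartNew(work, TaskCreationOptions.LongRunning);

        // Wait returns false if the task has not completed within the timeout.
        if (!task.Wait(timeout))
        {
            // The worker thread is not aborted; it keeps running in the background.
            throw new TimeoutException("Operation did not finish within " + timeout);
        }

        return task.Result;
    }
}

// Usage with the repro above (the 10-second limit is an arbitrary choice):
// var doc = TimeoutGuard.Run(
//     () => new Aspose.Pdf.Document(htmlStream, new Aspose.Pdf.HtmlLoadOptions()),
//     TimeSpan.FromSeconds(10));
```

This only stops the caller from blocking. The stuck load keeps its thread (and the input stream) busy, so it is a stopgap and not a fix.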
I don’t need JavaScript at all, since I’m processing untrusted input. How can I disable, remove, or skip the loading of all scripts?
I didn’t find any relevant option in the HtmlLoadOptions class, apart from ResourceLoadingStrategy, which is good for preventing network calls.
Something like Aspose.Html.Sandbox would be useful.
Thanks,
SM