HTML to PDF Conversion with JavaScript using C# hangs .NET host process indefinitely

I was trying to convert some HTML to PDF, but the whole process hung indefinitely during the document load phase (in the Document constructor).

This is the code to reproduce the issue with the NuGet package (.NET Framework 4.8):

using (var htmlStream = new System.IO.MemoryStream(System.Text.Encoding.UTF8.GetBytes(@"
		<div id=""jstext""></div>
		<script type=""text/javascript"">
			document.getElementById('jstext').innerHTML = '>JS TEXT<';
		</script>")))
{
	// The constructor never returns: loading the document hangs here.
	new Aspose.Pdf.Document(htmlStream, new Aspose.Pdf.HtmlLoadOptions());
}

If I HTML-encode the string, like this: '&gt;JS TEXT&lt;', it works correctly.

To work around the issue, I had to wrap the call in Task.Factory.StartNew and impose a timeout on the Wait call.
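
A minimal sketch of that workaround follows. The ConversionGuard class and its TryRun method are hypothetical names introduced only for illustration; Task.Factory.StartNew and Task.Wait(TimeSpan) are the standard .NET calls, and the 30-second timeout in the usage comment is an arbitrary assumption.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper (not part of Aspose.Pdf): runs a piece of work on a
// separate thread and stops waiting for it once the timeout has elapsed.
static class ConversionGuard
{
    // Returns true and sets `result` if `work` finished within `timeout`.
    // Returns false if the timeout elapsed first. If `work` throws, Wait
    // rethrows the failure wrapped in an AggregateException.
    public static bool TryRun<T>(Func<T> work, TimeSpan timeout, out T result)
    {
        // LongRunning hints the scheduler to use a dedicated thread, so a
        // hung call does not occupy a regular thread-pool worker.
        Task<T> task = Task.Factory.StartNew(work, TaskCreationOptions.LongRunning);

        if (task.Wait(timeout))
        {
            result = task.Result;
            return true;
        }

        result = default(T);
        return false;
    }
}

// Usage with the repro above (requires the Aspose.Pdf package):
//
//   Aspose.Pdf.Document doc;
//   bool loaded = ConversionGuard.TryRun(
//       () => new Aspose.Pdf.Document(htmlStream, new Aspose.Pdf.HtmlLoadOptions()),
//       TimeSpan.FromSeconds(30),   // assumed timeout, tune as needed
//       out doc);
```

Note that the timeout only stops the caller from waiting. The hung call keeps running on its thread and keeps using the stream, so the stream should not be disposed or reused while it is still stuck. Running the conversion in a separate process that can be killed would be the stricter form of isolation.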

Since I’m processing untrusted input and don’t really care about JavaScript, how can I disable, remove, or skip the loading of all scripts?

I didn’t find any relevant option in the HtmlLoadOptions class apart from ResourceLoadingStrategy, which is good for preventing network calls.
Something like the Aspose.Html.Sandbox would be useful.



You can use the PdfJavaScriptStripper.Strip method to remove JavaScript from the document. However, it also throws an exception in your case.

We have logged this problem in our issue tracking system as PDFNET-51140. You will be notified via this forum thread once this issue is resolved. We apologize for the inconvenience.

Is there any update on this issue?
I would like to use your library to convert untrusted files but I cannot do it because of this problem.
At the moment I have to use Pechkin just for this task.



We regret to inform you that the earlier logged ticket has not been resolved yet due to other pending issues in the queue. Nevertheless, we have recorded your concerns and will inform you as soon as we make some progress towards its resolution. Please be patient and spare us some time.

We are sorry for the inconvenience.