Free Support Forum - aspose.com

Is there any chance that Aspose.Words now supports JavaScript Execution when converting from HTML to DOCX/PDF?

Hello,

Thank you for letting us know that Aspose.HTML now supports JavaScript Execution. I have been testing it out, and while I see the JavaScript Execution is working, I am getting different output results from converting HTML to DOCX in Aspose.HTML than I am in Aspose.Words. The resulting DOCX in Aspose.Words is formatted correctly and does not add in Highlighting Annotations for highlighted text. That is the desired output. Is there any chance that Aspose.Words now supports JavaScript Execution when converting from HTML to DOCX/PDF?

Thanks!

@InfoEdGlobal

Your inquiry has been moved to the Aspose.Words category, where you will be assisted accordingly.

@InfoEdGlobal Currently, Aspose.Words mimics MS Word behavior, which does not support JavaScript execution.
Could you please attach a sample document with the JavaScript features you would like Aspose.Words to support? We will investigate the document and consider the possibility of adding such a feature in one of the future versions.

Hello, I have attached a few documents and will try to explain the requirements.

In the TestHTMLTemplateJS.7z file, there is a TestHTMLTemplateJS.htm file. This is the file that ultimately needs to be converted to a DOCX while executing Javascript.

asposeWordsDocx.docx was created using Aspose.Words with just a basic open and save as DOCX. The resulting DOCX is the desired output; however, the JavaScript within the HTML file is not executed. The code used is the following:

Dim asposeWords As New Words.Document("C:\InfoEdSVN\Internal\Alpha15\TestHTMLTemplateJS.htm")
asposeWords.Save("C:\InfoEdSVN\Internal\Alpha15\asposeWordsDocx.docx")

asposeHtmldocx.docx was created using Aspose.HTML. With this approach the JavaScript is executed and the output looks OK, but the main issue is that the text is split out into objects. Notably, the highlighted text is turned into what is essentially a yellow rectangle / Word shape. This does not happen with Aspose.Words. Our clients are required to remove the text (and the corresponding highlighting) and enter their own text, which is very difficult to do with the objects that get created instead of plain text. The code to generate this DOCX is the following:

    Dim margin As New Html.Drawing.Margin(36, 45, 36, 45)
    Dim docRenderingOptions As New Html.Rendering.Doc.DocRenderingOptions
    docRenderingOptions.PageSetup.AnyPage = New Html.Drawing.Page(margin)
    Using device As New Html.Rendering.Doc.DocDevice(docRenderingOptions, "C:\InfoEdSVN\Internal\Alpha15\AsposeHtmldocx.docx")
        Using htmlrenderer As New Html.Rendering.HtmlRenderer
            Using htmlDocument As New Html.HTMLDocument("C:\InfoEdSVN\Internal\Alpha15\TestHTMLTemplateJS.htm")
                htmlrenderer.Render(device, htmlDocument)
            End Using
        End Using
    End Using

Please let me know how I can achieve the desired results. Again, I need the javascript to be executed from the html file, but I want the output to be what I get from Aspose.Words (actual text instead of objects and shapes).

Let me know if you have questions about the requirements.

Thank you,
Brett

TestHTMLTemplateJS.7z (4.1 KB)
asposeWordsDocx.docx (25.8 KB)
AsposeHtmldocx.docx (68.5 KB)

@InfoEdGlobal Thank you for the additional information. I have logged the feature request as WORDSNET-24368. However, it is not likely that we will support this feature in Aspose.Words itself. In your case, you need to get the rendered HTML (after executing JavaScript) and then load it into the Aspose.Words DOM.
@asad.ali Please also log a feature request in Aspose.HTML for an HTML renderer that returns the rendered HTML from the source HTML after executing the JavaScript in it.

@InfoEdGlobal We’re not going to implement a JavaScript engine in Aspose.Words, and HTML documents will be loaded with JavaScript support turned off. As a partial workaround, you can re-save documents with Aspose.HTML. For example, this approach lets you generate a document with styles modified by scripts and save that document for further processing by Aspose.Words:

static void Main(string[] args)
{
    HtmlToDocx(@"x:\source.html");

    // Re-save HTML using Aspose.HTML in order to execute JavaScript and save modifications.
    // HTMLDocument is disposable, so it is released when Main exits.
    using HTMLDocument htmlDocument = new HTMLDocument(@"x:\source.html");
    htmlDocument.Save(@"x:\processed.html");

    HtmlToDocx(@"x:\processed.html");
}

static void HtmlToDocx(string htmlFilePath)
{
    Document document = new Document(htmlFilePath);
    document.Save(Path.ChangeExtension(htmlFilePath, ".docx")); 
}

Thank you for the response; however, this does not appear to work. I used the TestHTMLTemplateJS.htm file that I sent previously in this thread. It is converted to Word fine, but the JavaScript is not processed. It appears that merely saving to a new HTML file using Aspose.HTML does not actually execute JavaScript.

Thanks,
Brett

@InfoEdGlobal I will move the thread into Aspose.HTML forum. My colleagues will help you shortly.

@InfoEdGlobal

A ticket, HTMLNET-4049, has been logged in our issue tracking system to further investigate this case. We will analyze your requirements in detail and let you know as soon as the ticket is resolved. Please be patient and allow us some time.

We are sorry for the inconvenience.

@InfoEdGlobal

Aspose.HTML supports JavaScript processing while opening, rendering, or saving documents. This document contains a script that changes the “value” property of the “input” element, and it executes on the “onload” event. If you access the property of the corresponding element after opening the document, you will see that the script has been executed.

using var doc = new HTMLDocument("TestHTMLTemplateJS.htm");
var input = (HTMLInputElement)doc.GetElementById("javascriptTest");
var result = input.Value == "javascript works";

According to the HTML specification, for security reasons, setting this property does not change the “value” attribute of the “input” element, so when this document is saved, the result of the script is not visible. We can offer two solutions to this problem. First, you can change the script in the document so that it sets the attribute and not the property, for example like this:

function javascriptFunction() {
  document.getElementById('javascriptTest').setAttribute('value', 'javascript works');
}

In this case, after saving, the “value” attribute of the “input” element will contain the desired value. Second, we can add a save option so that, when the document is saved, the “value” attribute of the “input” element is written from the current “value” property.
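The first option above can be generalized so that each input does not need its own setAttribute call. The following is only a sketch: the function name syncInputValueAttributes is hypothetical (it is not part of Aspose.HTML), and it assumes the script engine provides the standard DOM methods getElementsByTagName and setAttribute.

```javascript
// Hypothetical helper (not an Aspose.HTML API): copies the live "value"
// property of every <input> into its "value" attribute, so the value
// is still present after the document is serialized.
function syncInputValueAttributes(doc) {
  var inputs = doc.getElementsByTagName('input');
  for (var i = 0; i < inputs.length; i++) {
    // The property holds the current value; the attribute is what gets saved.
    inputs[i].setAttribute('value', inputs[i].value);
  }
  // Return the number of inputs processed, which is handy for debugging.
  return inputs.length;
}
```

It could be called as the last statement of the existing onload handler, for example syncInputValueAttributes(document); after the property has been set.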

Thank you for the reply. In order for this to work as our clients are expecting, and for it to work how it is currently working with ABCPDF, I believe we would need the second option.

I have another example, from a client, where they use the following script to add a space on each side of any tilde (~):

    <script type="text/javascript">
        document.body.innerHTML = document.body.innerHTML.replace(/~/g, ' ~ ');
    </script>

When debugging, I look at the Aspose.HTML object that I use to open the HTML, and this JavaScript has not executed. I am trying with the latest version of Aspose.HTML. I am able to get my “javascript works” example working using your setAttribute syntax, but what JavaScript would the client use to accomplish what they are trying to do above? (i.e. change “Attenuated~Replication” to “Attenuated ~ Replication”; note the spaces)

Thanks,
Brett

@InfoEdGlobal

We have recorded your feedback under the ticket and will let you know our findings once the investigation is complete.

@InfoEdGlobal

When processing this JavaScript, an error occurs related to the insertion of the element during text replacement. If this script, which replaces the ‘~’ character, is moved from ‘body’ to ‘head’ and run on the onload event, then it will work. We will fix this bug as part of the same task.
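As a rough sketch of the workaround described above, the replacement could be wrapped in functions defined in the “head” and triggered from the onload event. The names padTildes and padTildesInBody are hypothetical, and this assumes the same innerHTML approach as the client's original script.

```javascript
// Hypothetical helper: puts one space on each side of every tilde.
function padTildes(html) {
  return html.replace(/~/g, ' ~ ');
}

// Hypothetical onload handler: rewrites the body markup in one pass.
// Defined in <head> and meant to run only after the document has loaded.
function padTildesInBody(doc) {
  doc.body.innerHTML = padTildes(doc.body.innerHTML);
}
```

With these in a script block in the “head”, the body tag would become <body onload="padTildesInBody(document)">. Note that replacing on innerHTML also touches tildes inside attribute values such as URLs, so this is only suitable when the markup contains none.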

@InfoEdGlobal

We have added a new option “SerializeInputValue” with which you can get the desired behavior. An example of its use is shown in the following code snippet:

using HTMLDocument htmlDocument = new HTMLDocument(@"X:\source.html");
var options = new HTMLSaveOptions { SerializeInputValue = true };
htmlDocument.Save(@"x:\processed.html", options);

We also fixed a bug that occurred when processing the script for adding spaces around tildes (~). All of these changes will be part of the 22.11.0 release.

The issues you have found earlier (filed as HTMLNET-4049) have been fixed in this update. This message was posted using Bugs notification tool by avpavlysh