Can Aspose.HTML interpret a whole HTML package folder?

Hi,

We are trying to upload the following HTML package (a folder of linked files) to GWT. Can Aspose.Html
interpret it, so that we can construct the HTML on the server and send it to GWT over RPC?

File paths are used when linking to external files like:

Web pages
Images
Style sheets
JavaScripts

Ex:

The whole HTML package looks like this:

/css
/EARoot/EA1.htm
/files
/images
/js
blank.html
index.html
toc.html

Is there any code that can open “index.html”, get the whole DOM (div elements, etc.) of the index page, and also discover the other pages from the “href” attributes in index.html, and so on?

Many thanks!

Ruhong

@ruhongcai

Thank you for contacting support.

We would like to share with you that Aspose APIs do not have any platform-related limitations, so you can manipulate or render any HTML file.

In case you face any problem, please share a narrowed down sample application reproducing the issue so that we may try to reproduce and investigate the problem in our environment.

Hi,

Thanks for the reply.

Here is what I need to do: read in the whole HTML package and construct pure HTML files that can be uploaded to GWT and still support navigation.

The input is the whole HTML package (please see the attached file), which contains HTML files, CSS and JavaScript.

(1) Can Aspose.HTML read in all of the information in some way, for example from “index.htm” alone or from
each HTML file? What about the “js” and “CSS” files?

(2) Construct all the HTML files again without JS (the CSS will be inside the HTML). For example,
index.html has three buttons, “Demo 1”, “Demo 2” and “Intractive demo”; clicking one goes to another page.
The new HTML files should link to the other pages with a button and an “href” in the HTML, and get rid of the JS.

Please provide some sample code… Many thanks!
CircularNavigation.zip (67.9 KB)

Ruhong

@ruhongcai

We are looking into your query and will get back to you with our findings, soon. In case some information is required from you, we will be requesting it accordingly.

Many thanks!

Ruhong

@ruhongcai

Thank you for being patient.

We would like to share with you that the Aspose.HTML API does not support creating a pure HTML file by reading all of the information from the index.htm file. However, a ticket with ID HTMLNET-1140 has been logged in our issue management system for further investigation and resolution. The issue ID has been linked with this thread so that you will receive a notification as soon as the issue is resolved.

We are sorry for the inconvenience.

@ruhongcai

Thank you for being patient.

We would like to share with you that the ticket HTMLNET-1140 has been resolved and now you can save all resources inside HTML files and also remove all JavaScript.

To get HTML files without JavaScript, you need to disable script execution; otherwise the scripts will be executed during parsing.

Aspose.Html.Configuration config = new Aspose.Html.Configuration();
config.Security |= Aspose.Html.Sandbox.Scripts;

Then you can configure the resource handling behavior with the new Aspose.Html.Saving.HTMLSaveOptions class.

Aspose.Html.Saving.HTMLSaveOptions saveOptions = new Aspose.Html.Saving.HTMLSaveOptions();
// Disable the depth filter so that all referenced resources are saved.
saveOptions.ResourceHandlingOptions.MaxHandlingDepth = -1;
// Restrict handling to local resources by setting a URL restriction.
saveOptions.ResourceHandlingOptions.UrlRestriction = Aspose.Html.Saving.UrlRestriction.SameHost;
// Embed resources into the HTML files by default.
saveOptions.ResourceHandlingOptions.Default = Aspose.Html.Saving.ResourceHandling.Embed;
// Discard all JavaScript.
saveOptions.ResourceHandlingOptions.JavaScript = Aspose.Html.Saving.ResourceHandling.Discard;

As a result, you will receive three HTML files that contain all local resources embedded in them, except the JavaScript, which is discarded.

A complete usage example looks like this:

using (Aspose.Html.Configuration config = new Aspose.Html.Configuration())
{
    // Disable script execution so that scripts do not run during parsing.
    config.Security |= Aspose.Html.Sandbox.Scripts;
    using (Aspose.Html.HTMLDocument doc = new Aspose.Html.HTMLDocument(@"C:\inputFolder\index.html", config))
    {
        Aspose.Html.Saving.HTMLSaveOptions saveOptions = new Aspose.Html.Saving.HTMLSaveOptions();
        // Disable the depth filter so that all referenced resources are saved.
        saveOptions.ResourceHandlingOptions.MaxHandlingDepth = -1;
        // Restrict handling to local resources by setting a URL restriction.
        saveOptions.ResourceHandlingOptions.UrlRestriction = Aspose.Html.Saving.UrlRestriction.SameHost;
        // Embed resources into the HTML files by default.
        saveOptions.ResourceHandlingOptions.Default = Aspose.Html.Saving.ResourceHandling.Embed;
        // Discard all JavaScript.
        saveOptions.ResourceHandlingOptions.JavaScript = Aspose.Html.Saving.ResourceHandling.Discard;
        doc.Save(@"C:\outputFolder\index.html", saveOptions);
    }
}

You can also save all the HTML files as a single MHTML file by using the Aspose.Html.Saving.MHTMLSaveOptions class.

using (Aspose.Html.Configuration config = new Aspose.Html.Configuration())
{
    // Disable script execution so that scripts do not run during parsing.
    config.Security |= Aspose.Html.Sandbox.Scripts;
    using (Aspose.Html.HTMLDocument doc = new Aspose.Html.HTMLDocument(@"C:\inputFolder\index.html", config))
    {
        Aspose.Html.Saving.MHTMLSaveOptions saveOptions = new Aspose.Html.Saving.MHTMLSaveOptions();
        // Disable the depth filter so that all referenced resources are saved.
        saveOptions.ResourceHandlingOptions.MaxHandlingDepth = -1;
        // Restrict handling to local resources by setting a URL restriction.
        saveOptions.ResourceHandlingOptions.UrlRestriction = Aspose.Html.Saving.UrlRestriction.SameHost;
        // Embed resources into the MHTML file by default.
        saveOptions.ResourceHandlingOptions.Default = Aspose.Html.Saving.ResourceHandling.Embed;
        // Discard all JavaScript.
        saveOptions.ResourceHandlingOptions.JavaScript = Aspose.Html.Saving.ResourceHandling.Discard;
        doc.Save(@"C:\outputFolder\index.mht", saveOptions);
    }
}

As a result of this code, you will receive a single MHTML file that contains the three HTML pages. We hope this will be helpful. Please feel free to contact us if you need any further assistance.
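The original question also asked how to open index.html and find the other pages through their “href” attributes. The following is only a minimal sketch of one way this might be done, not an official recommendation. It assumes the same hypothetical input path used in the examples above, and it assumes that the DOM methods QuerySelectorAll and GetAttribute behave in Aspose.HTML as they do in the standard DOM; please check both against the API reference for your version.

```csharp
using System;

class ListLinkedPages
{
    static void Main()
    {
        using (Aspose.Html.Configuration config = new Aspose.Html.Configuration())
        {
            // Disable script execution, as in the examples above.
            config.Security |= Aspose.Html.Sandbox.Scripts;

            // Hypothetical input path, reused from the examples above.
            using (Aspose.Html.HTMLDocument doc = new Aspose.Html.HTMLDocument(@"C:\inputFolder\index.html", config))
            {
                // Select every anchor element that carries an href attribute.
                foreach (Aspose.Html.Dom.Element link in doc.QuerySelectorAll("a[href]"))
                {
                    // Print the raw attribute value; relative URLs are not resolved here.
                    Console.WriteLine(link.GetAttribute("href"));
                }
            }
        }
    }
}
```

Note that this only finds links that are present in the markup. If a page builds its navigation in JavaScript, those targets will not appear as href attributes while script execution is disabled.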
