How to ignore <w:WordDocument>


We have an HTML file that contains <w:WordDocument> tags.


Can these tags be ignored when loading the HTML file with Aspose.Words?
At the moment they are converted to a paragraph with the content “Print”.

We use

var doc = new Aspose.Words.Document(inputHtmlFileStream, options);

to create the document, and the result contains:

  Node type: Paragraph
  Contents: "Print"

I have attached the HTML file (792 bytes).

@AlekseyMuzoverov Currently there is no option to ignore this. However, I have logged an investigation task, WORDSNET-24831, to check whether Aspose.Words should keep the current behavior and read the <w:WordDocument> content, or ignore it as MS Word does. We will keep you updated and let you know once we have more information.

@AlekseyMuzoverov We have completed the analysis. We are going to add an option to import HTML documents the way MS Word does. This feature request is logged as WORDSNET-10399. The task is currently postponed and is not yet scheduled for development, so there are no estimates at the moment.
The current version of Aspose.Words follows the HTML standard in this case, so the resulting document should contain the text that browsers render.
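
Until such an option is available, one possible workaround is to remove the <w:WordDocument> element from the markup before passing it to Aspose.Words. The following is only a minimal sketch and not an Aspose.Words feature: the HtmlPreprocessor class, the StripWordDocument helper, and the sample markup are hypothetical, and it assumes the element is well-formed and not nested. A regular expression is a blunt tool for HTML, so please check it against your real input.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Aspose.Words.
public static class HtmlPreprocessor
{
    // Matches a whole <w:WordDocument>...</w:WordDocument> element,
    // including any attributes and content that spans several lines.
    private static readonly Regex WordDocumentElement = new Regex(
        @"<w:WordDocument\b[^>]*>.*?</w:WordDocument\s*>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns the markup with every <w:WordDocument> element removed.
    public static string StripWordDocument(string html)
    {
        return WordDocumentElement.Replace(html, string.Empty);
    }

    public static void Main()
    {
        // Made-up sample input; the real file may look different.
        string html =
            "<html><body><w:WordDocument><w:View>Print</w:View></w:WordDocument>" +
            "<p>Hello</p></body></html>";

        string cleaned = StripWordDocument(html);
        Console.WriteLine(cleaned);
        // prints: <html><body><p>Hello</p></body></html>
    }
}
```

The cleaned string can then be written to a stream (for example a MemoryStream over its UTF-8 bytes, assuming the file is UTF-8) and loaded with the same new Aspose.Words.Document(stream, options) call as before.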