I want to extract text from an HTML file, which may be XHTML.
I have attached a sample file.
Can I get help on how to extract text from this file type?
Test_html_file.html.zip (691 Bytes)
To save the document in TXT format, please use the following code example.
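// Load the HTML (or XHTML) file; MyDir is the folder that contains it.
Document doc = new Document(MyDir + "Test_html_file.html");
// Save as plain text; the output format is inferred from the .txt extension.
doc.Save(MyDir + "21.1.txt");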
If you want to extract some specific content from the document, please read the following article.
How to Extract Selected Content Between Nodes in a Document
If you still face problems, please share more detail about your query along with the expected output document. We will then provide more information.