Get HTML of extracted text from a Word file

Hi Aspose Team,

I am extracting some specific text from a Word file using Aspose.Words for C#.
Now I need to get the HTML of the extracted text.

I don’t want to convert the whole Word file to HTML and then rework it. Is there a method I can use to get the HTML of just the extracted text?

How can I do that?

@Somnath.kamthe

Summary
You can obtain HTML for only the part of a Word document that you have extracted, without converting the whole document. Aspose.Words for .NET allows you to render a single Node (e.g., a Paragraph or Run) directly to HTML, or to copy a collection of nodes into a temporary Document and save that to HTML.

Why it works
Node objects expose a ToString(SaveFormat) overload that accepts SaveFormat.Html. When you call this overload, the library generates the HTML markup for that node only, so you do not need to convert the entire source file. For several nodes at once, you can import them into a temporary Document and save that document with HtmlSaveOptions.

What to do next

  1. Extract the required nodes – use your existing logic (Find, SelectNodes, Range, etc.) to obtain the Node objects that contain the text you need.

  2. Render a single node to HTML

using Aspose.Words;
using Aspose.Words.Saving;

// Assume 'node' is the extracted Paragraph/Run/etc.
Node node = /* your extracted node */;
string html = node.ToString(SaveFormat.Html);
// 'html' now contains something like <p>Your text</p>
  3. Render multiple nodes (if needed)

using Aspose.Words;
using Aspose.Words.Saving;
using System.IO;
using System.Text;

Document srcDoc = new Document("Source.docx");

// Temporary document that will hold only the selected nodes
Document tmpDoc = new Document();
Body body = tmpDoc.FirstSection.Body;
body.RemoveAllChildren(); // drop the default empty paragraph

foreach (Paragraph para in srcDoc.GetChildNodes(NodeType.Paragraph, true))
{
    if (para.GetText().Contains("keyword"))
    {
        // A node must be imported before it can be added to another document
        body.AppendChild(tmpDoc.ImportNode(para, true, ImportFormatMode.KeepSourceFormatting));
    }
}

HtmlSaveOptions options = new HtmlSaveOptions
{
    ExportImagesAsBase64 = true, // embed images if any
    PrettyFormat = true
};

using (MemoryStream ms = new MemoryStream())
{
    tmpDoc.Save(ms, options);
    string html = Encoding.UTF8.GetString(ms.ToArray());
    // 'html' now contains an HTML page with the selected paragraphs only
}

  4. Apply a license (if you have one) – without a valid license the output may contain evaluation watermarks.

var license = new License();
license.SetLicense("Aspose.Words.NET.lic");

  5. Use the HTML – write the html string to Excel, embed it in a web page, or process it further as required.

Note: The evidence retrieved does not contain explicit documentation for the methods used above (for example, Node.ToString(SaveFormat.Html)). The approach follows the typical Aspose.Words API pattern, but you should verify the exact method signatures and their availability in your version of Aspose.Words against the official Aspose.Words for .NET documentation.

Warning: I could not fully verify this explanation against the retrieved sources. Please double-check the behaviour in your environment and confirm the exact SDK version.

SDK version unclear; refer to the official release notes before applying this guidance.

@Somnath.kamthe You can use the Node.ToString method and pass SaveFormat.Html to get the HTML of any node in the document, or of the whole document.

Thanks for your prompt response.

How can I get the HTML of a table row?

@Somnath.kamthe The same way as for any other node in the document:

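using System;
using Aspose.Words;
using Aspose.Words.Tables;

Document doc = new Document(@"C:\Temp\in.docx");

// First row of the first table in the first section
Table t = doc.FirstSection.Body.Tables[0];
Row r = t.FirstRow;

Console.WriteLine(r.ToString(SaveFormat.Html));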

I tried the same solution, but it throws the object reference exception shown below.
My document has one table with one row.

The Row r object is not null; it has a value, as the screenshot below shows.

Can you please provide a solution for this?

@Somnath.kamthe The exception means there is either no table in your document’s first section or there are no rows in the table. It is hard to tell for sure without the actual document.
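
To rule out those two causes, the earlier snippet can be guarded as in the sketch below. It is only a diagnostic sketch: the file path is the placeholder used earlier in this thread, and the messages are illustrative. If both checks pass and the exception is still thrown, it is raised inside ToString itself rather than by a missing table or row.

using System;
using Aspose.Words;
using Aspose.Words.Tables;

Document doc = new Document(@"C:\Temp\in.docx"); // placeholder path

// Tables that are direct children of the first section's body
TableCollection tables = doc.FirstSection.Body.Tables;

if (tables.Count == 0)
{
    Console.WriteLine("No table found in the first section.");
}
else
{
    Row r = tables[0].FirstRow; // null when the table has no rows

    if (r == null)
        Console.WriteLine("The first table has no rows.");
    else
        Console.WriteLine(r.ToString(SaveFormat.Html));
}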

Hello Alexey,
While running the code you provided, I get an “object reference” exception, even though my document contains multiple tables and rows, and it will always have tables and rows.
I have attached the document.
Does the code you provided work for the attached document?

EPO-13484NE0090001-01-2026.docx (49.1 KB)

@Somnath.kamthe
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): WORDSNET-28860

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.