HTML tags to Microsoft Word Fields

Hello guys,

I have an HTML file that has to be converted to an MS Word document, and I used the regular Aspose.Words API to do so.

I noticed that only anchor tags are recognized and automatically converted to Microsoft Word fields (hyperlink fields in this case).

My question is: which other HTML tags are automatically recognized as fields? Do they have to contain any special attributes to be recognized as fields? If so, what are they? Please pass on documentation if it's available.

Any help on this would be appreciated.

Thanks.

–Aswin Anand

Hi

Thanks for your request. HTML is not a Word format, and it does not natively support all features of the Word formats. During a DOC->HTML->DOC round trip, MS Word adds special tags to mark HTML elements that represent special MS Word elements. This is MS Word “magic”, which it uses to round-trip between HTML and Word. What kind of fields do you need in your Word document? Maybe you should just use DocumentBuilder to insert them where they are needed.
Best regards.
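
To illustrate inserting fields directly with DocumentBuilder, here is a minimal sketch. It assumes the Aspose.Words for .NET assembly is referenced; the output file name and the URL are placeholders chosen only for this example, and the two fields shown (PAGE and a hyperlink) are just samples of what can be inserted.

```csharp
using Aspose.Words;

public static class FieldDemo
{
    public static void Main()
    {
        // Start from an empty document and attach a builder to it.
        Document doc = new Document();
        DocumentBuilder builder = new DocumentBuilder(doc);

        // Insert a PAGE field from its field code.
        builder.Write("Page ");
        builder.InsertField("PAGE");
        builder.Writeln();

        // Insert a HYPERLINK field; the last argument says the target is a URL, not a bookmark.
        builder.InsertHyperlink("Aspose", "https://www.aspose.com", false);

        // "fields.doc" is a hypothetical output path.
        doc.Save("fields.doc");
    }
}
```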

Hi Alexey,

Is it possible to make Aspose recognize certain tags as fields with custom Aspose magic? For example, assume I have a div tag on the page that contains the table of contents for the document. Is it possible to annotate this div in a certain way so that Aspose can recognize it as a field?

Here’s an example:

<div class="toc" type="msword-field-toc">
    <ol>
        <li>Introduction</li>
        <li>Chapter 1</li>
    </ol>
</div>

Now, when this example is imported, Aspose should recognize that TOC div as a TOC field in MS Word. Is it possible to achieve the same in Aspose?

–Aswin Anand

Hi

Thanks for your request. Maybe you can use a paragraph with a bookmark as a placeholder, and then use DocumentBuilder to insert the field into this paragraph. For example, see the following code:

using System.IO;
using Aspose.Words;

// Read the HTML from a file.
string html = File.ReadAllText(@"Test001\test.html");
// Create the document and a DocumentBuilder.
Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
// Insert the HTML into the document.
builder.InsertHtml(html);
// Move the document builder to the bookmark.
builder.MoveToBookmark("tocBk");
// Insert the TOC.
// Note: you should update the TOC in MS Word (Ctrl+A and then press F9).
builder.InsertTableOfContents("\\o \"1-3\" \\h \\z \\u");
// Save the output document.
doc.Save(@"Test001\out.doc");

Here is the HTML:

<html>
<body>
    <p><a name="tocBk"></a></p>
    <h1>This is heading 1</h1>
    <h2>This is heading 2</h2>
    <h3>This is heading 3</h3>
</body>
</html>

The <a name="tocBk"></a> anchor in the first paragraph is the bookmark.
Hope this helps.
Best regards.
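
If the source HTML already carries a marker such as the type="msword-field-toc" div from the question, one option is to rewrite that div into the bookmark placeholder before calling InsertHtml. The following is only a sketch under stated assumptions: the TocPlaceholder class and its Replace method are hypothetical helpers written for this example (they are not part of Aspose.Words), the marker attribute is the one proposed in the question, and the marked div is assumed to contain no nested div elements, since a regular expression cannot match nested tags reliably.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TocPlaceholder
{
    // Matches a div carrying the type="msword-field-toc" marker, including its content.
    // Assumption: the marked div contains no nested <div> elements.
    private static readonly Regex TocDiv = new Regex(
        "<div\\b[^>]*\\btype\\s*=\\s*\"msword-field-toc\"[^>]*>.*?</div>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Replaces each marked div with an empty paragraph that holds a named anchor.
    public static string Replace(string html, string bookmarkName)
    {
        string placeholder = "<p><a name=\"" + bookmarkName + "\"></a></p>";
        // A MatchEvaluator is used so the placeholder is inserted literally.
        return TocDiv.Replace(html, match => placeholder);
    }
}
```

The result of TocPlaceholder.Replace(html, "tocBk") can then be passed to builder.InsertHtml, followed by MoveToBookmark("tocBk") and InsertTableOfContents as in the code above.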

Thanks, Alexey. I will implement your suggestion and get back to you if there’s any confusion.

–Aswin Anand