Getting a Table of Contents from a Document and storing the Sections and Headers into an XML file

Hello,

Apologies for the double post, but I couldn’t get my other account working since it was on work email.

I am trying to get a Table of Contents from a document that begins on page 8, and spans 14 more pages to page 22.

I need to get the Table of Contents (presumably as a node?) and then extract the Section headings. Each Section heading in the ToC has headings underneath it, and each of those corresponds to a Table name within that section.

I would like to store the Sections and their corresponding Headings (Table names) in an XML file, or in some other form, so that I can reconstruct a new document in the same order at a later point.

I can’t find anything to do with ‘Getting ToC from Document’, or otherwise reading an entire ToC and storing it in a variable.

The ToC format is as follows:
Section 1
someTableName in section 1
someTableName in section 1
someTableName in section 1
someTableName in section 1
Section 2
someTableName in section 2
someTableName in section 2
someTableName in section 2

Etcetera.
Any help is greatly appreciated.

@casperf11,

I think you can meet this requirement with the following code:

using System;
using Aspose.Words;
using Aspose.Words.Fields;

Document doc = new Document("D:\\Temp\\toc.docx");

foreach (Field field in doc.Range.Fields)
{
    // In a hyperlinked TOC, each entry is a HYPERLINK field pointing at a "_Toc" bookmark.
    if (field.Type.Equals(FieldType.FieldHyperlink))
    {
        FieldHyperlink hyperlink = (FieldHyperlink)field;
        if (hyperlink.SubAddress != null && hyperlink.SubAddress.StartsWith("_Toc"))
        {
            // The paragraph that holds this TOC entry; check it before using it.
            Paragraph tocItem = (Paragraph)field.Start.GetAncestor(NodeType.Paragraph);
            if (tocItem != null)
            {
                Console.WriteLine(tocItem.ToString(SaveFormat.Text).Trim());
                Console.WriteLine("------------------");

                // Get the location this TOC item is pointing to.
                // The indexer returns null when the bookmark does not exist.
                Bookmark bm = doc.Range.Bookmarks[hyperlink.SubAddress];
                if (bm != null)
                {
                    Paragraph pointer = (Paragraph)bm.BookmarkStart.GetAncestor(NodeType.Paragraph);
                    Console.WriteLine(pointer.ToString(SaveFormat.Text));
                }
            }

            Console.WriteLine("|||||||||||||||||||||||||||||");
        }
    }
}
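The loop above only prints the entries. To keep the Section and Table order for later, the entries could be collected and written to XML. The following is a minimal sketch using only System.Xml.Linq, not an Aspose.Words feature. The class name TocXml, the method Build, and the element names Toc, Section, and Table are all hypothetical. It assumes you can determine a level for each entry (for example, from the TOC paragraph's style, with "TOC 1" for Sections and "TOC 2" for Table names) and that the entry text may end with a tab followed by a page number.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical helper: not part of Aspose.Words.
public static class TocXml
{
    // Builds <Toc><Section name="..."><Table name="..."/></Section></Toc>
    // from TOC entries given in document order.
    // Level 1 is treated as a Section; any other level is treated as a
    // Table name under the most recent Section.
    public static XDocument Build(IEnumerable<(int Level, string Text)> entries)
    {
        var root = new XElement("Toc");
        XElement currentSection = null;

        foreach (var (level, text) in entries)
        {
            // Assumption: the entry text may look like "Title<TAB>PageNumber";
            // keep only the part before the first tab.
            string title = text.Split('\t')[0].Trim();
            if (title.Length == 0)
                continue;

            if (level == 1 || currentSection == null)
            {
                // A sub-entry seen before any Section also starts a Section,
                // so that no entry is silently dropped.
                currentSection = new XElement("Section", new XAttribute("name", title));
                root.Add(currentSection);
            }
            else
            {
                currentSection.Add(new XElement("Table", new XAttribute("name", title)));
            }
        }

        return new XDocument(root);
    }
}

// Usage (file name is a placeholder):
//   var entries = new List<(int, string)> { (1, "Section 1\t8"), (2, "someTableName\t9") };
//   TocXml.Build(entries).Save("toc.xml");
```

Inside the loop above, you would add one (level, text) pair per tocItem instead of calling Console.WriteLine, then call Build once after the loop. Reading the file back with XDocument.Load returns the Sections and Tables in the same order.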

This worked perfectly. I was struggling with this for a good while, thanks a lot. The quick response is also really appreciated.

@casperf11,

Thanks for your feedback. In case you have further inquiries or need any help, please let us know.

Hey, I don’t know if I’m blind or not, but I couldn’t find anything in the documentation on working with a Table of Contents that goes as far as your answer does. Maybe it would be useful to add your answer to the documentation for other people in the future? :slight_smile:

@casperf11,

Thanks for your suggestions. We will be sure to improve online documentation in this regard. We apologize for any inconvenience.

You guys are awesome.