Reading a table of contents (in Java)

Hello,

I’m looking at Aspose.Words for Java right now and need an example of reading a table of contents (TOC) from a Word document. Can you help me?

Thank you
Max

@maxwenzel,

I think you can meet this requirement by using the following code:

// Requires: import com.aspose.words.*;
Document doc = new Document("E:\\input.docx");

// TOC entries generated with hyperlinks are HYPERLINK fields whose sub-address is a "_Toc..." bookmark.
for (Field field : doc.getRange().getFields())
{
    if (field.getType() == FieldType.FIELD_HYPERLINK)
    {
        FieldHyperlink hyperlink = (FieldHyperlink)field;
        if (hyperlink.getSubAddress() != null && hyperlink.getSubAddress().startsWith("_Toc"))
        {
            // The paragraph that holds this TOC entry; check it before using it.
            Paragraph tocItem = (Paragraph)field.getStart().getAncestor(NodeType.PARAGRAPH);
            if (tocItem == null)
                continue;
            System.out.println(tocItem.toString(SaveFormat.TEXT).trim());
            System.out.println("------------------");

            // Get the location this TOC item is pointing to (skip it if the bookmark is missing)
            Bookmark bm = doc.getRange().getBookmarks().get(hyperlink.getSubAddress());
            if (bm != null)
            {
                Paragraph pointer = (Paragraph)bm.getBookmarkStart().getAncestor(NodeType.PARAGRAPH);
                System.out.println(pointer.toString(SaveFormat.TEXT));
            }
            System.out.println("|||||||||||||||||||||||||||||");
        }
    }
}

Hope this helps.

Thank you for your hints. I’m one step further. The table of contents in my document is not a real ‘table of contents’, but only a table with a list of chapters.
My idea now is that I simply collect all the headings in the document and create a real table of contents myself.
Do you have an example of how I can get all the headings?

@maxwenzel,

You can build on the following code to achieve what you are looking for.

// Walk every paragraph in the document (doc is the Document loaded earlier).
for (Paragraph para : (Iterable<Paragraph>)doc.getChildNodes(NodeType.PARAGRAPH, true))
{
    // Read the style identifier once instead of once per comparison.
    int styleId = para.getParagraphFormat().getStyleIdentifier();
    if (styleId == StyleIdentifier.HEADING_1 ||
            styleId == StyleIdentifier.HEADING_2 ||
            styleId == StyleIdentifier.HEADING_3 /* and so on */)
    {
        System.out.println(para.toString(SaveFormat.TEXT) + " <-- this is a heading para");
    }
}
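
Once the headings are collected, turning them into a numbered outline is plain Java and does not need the library at all. Below is a minimal sketch of that step only. TocOutline, Heading, build and maxLevel are hypothetical names made up for this illustration, not part of Aspose.Words. It assumes you fill the list inside the loop above, mapping HEADING_1 to level 1, HEADING_2 to level 2 and so on, and taking the text from para.toString(SaveFormat.TEXT).

```java
import java.util.ArrayList;
import java.util.List;

class TocOutline {
    /** One collected heading: outline level (1 = Heading 1) and its text. Hypothetical helper type. */
    static final class Heading {
        final int level;
        final String text;

        Heading(int level, String text) {
            this.level = level;
            this.text = text;
        }
    }

    /** Builds one numbered, indented line per heading, for example "  1.2 Terms". */
    static List<String> build(List<Heading> headings, int maxLevel) {
        int[] counters = new int[maxLevel]; // counters[0] counts level 1, counters[1] level 2, ...
        List<String> lines = new ArrayList<>();
        for (Heading h : headings) {
            if (h.level < 1 || h.level > maxLevel) {
                continue; // ignore levels that should not appear in the outline
            }
            counters[h.level - 1]++;
            for (int i = h.level; i < maxLevel; i++) {
                counters[i] = 0; // a new heading restarts numbering of all deeper levels
            }
            StringBuilder sb = new StringBuilder();
            for (int i = 1; i < h.level; i++) {
                sb.append("  "); // two spaces of indent per level below the first
            }
            for (int i = 0; i < h.level; i++) {
                if (i > 0) {
                    sb.append('.');
                }
                sb.append(counters[i]);
            }
            // trim() drops the paragraph break that text export usually appends
            sb.append(' ').append(h.text.trim());
            lines.add(sb.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        List<Heading> headings = new ArrayList<>();
        headings.add(new Heading(1, "Introduction"));
        headings.add(new Heading(2, "Scope"));
        headings.add(new Heading(2, "Terms"));
        headings.add(new Heading(1, "Installation"));
        for (String line : build(headings, 3)) {
            System.out.println(line);
        }
    }
}
```

If a document jumps from Heading 1 straight to Heading 3, this sketch numbers the entry with a 0 in the skipped position, so you may want to handle that case differently in your own code.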
