Hi team,
We are using the method below to identify whether a paragraph belongs to the TOC.
public boolean isNodeTypeTOC(Paragraph paragraph, DocumentData data) {
    for (Field field : paragraph.getRange().getFields()) {
        if (Objects.equals(field.getType(), FieldType.FIELD_TOC)) {
            return true;
        }
    }
    return false;
}
This does not work for all paragraphs of the TOC. In the attached document, only the first paragraph of the TOC (DEFINITIONS AND INTERPRETATION) is identified as TOC; the rest are not identified as TOC.
test.docx (63.2 KB)
@SATISHSATYAEESH Paragraphs in the TOC are formatted with the TOC1…TOC9 styles, so you can use the following code to identify them:
// Collect the identifiers of the built-in TOC level styles.
List<Integer> tocStyles = new ArrayList<Integer>();
tocStyles.add(StyleIdentifier.TOC_1);
tocStyles.add(StyleIdentifier.TOC_2);
tocStyles.add(StyleIdentifier.TOC_3);
tocStyles.add(StyleIdentifier.TOC_4);
tocStyles.add(StyleIdentifier.TOC_5);
tocStyles.add(StyleIdentifier.TOC_6);
tocStyles.add(StyleIdentifier.TOC_7);
tocStyles.add(StyleIdentifier.TOC_8);
tocStyles.add(StyleIdentifier.TOC_9);

Document doc = new Document("C:\\Temp\\in.docx");
// Check every paragraph in the document, including nested ones.
for (Paragraph p : (Iterable<Paragraph>) doc.getChildNodes(NodeType.PARAGRAPH, true)) {
    boolean isTocParagraph = tocStyles.contains(p.getParagraphFormat().getStyleIdentifier());
    if (isTocParagraph) {
        // Print the text of each paragraph that uses a TOC style.
        System.out.println(p.toString(SaveFormat.TEXT).trim());
    }
}
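If it is more convenient to check by style name than by style identifier, the same idea can be sketched as a small standalone helper. This is only an illustration, not part of the Aspose.Words API: the class name TocStyleNames and the method isTocStyleName are hypothetical, and it assumes the built-in TOC styles carry their English names "TOC 1" to "TOC 9" and that the name is obtained from something like p.getParagraphFormat().getStyleName(). Localized or custom TOC style names would not match.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Aspose.Words.
public class TocStyleNames {
    // Assumption: built-in TOC level styles are named "TOC 1" .. "TOC 9".
    private static final Pattern TOC_STYLE_NAME = Pattern.compile("TOC [1-9]");

    /** Returns true if the style name looks like a built-in TOC level style. */
    public static boolean isTocStyleName(String styleName) {
        return styleName != null
                && TOC_STYLE_NAME.matcher(styleName.trim()).matches();
    }

    public static void main(String[] args) {
        System.out.println(isTocStyleName("TOC 1"));       // prints "true"
        System.out.println(isTocStyleName("TOC Heading")); // prints "false"
        System.out.println(isTocStyleName("Heading 1"));   // prints "false"
    }
}
```

Comparing style identifiers, as in the code above, is likely the more robust choice, since it does not depend on how the style is named in a given document.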