Extraction Issue 9

Dear team,

We are extracting images from DOCX files, but in the case below we are not able to read the figure captions. I found that the figure captions are stored in a list. Please refer to the input below and provide source code to read the figure captions.

Input : manuscript-clean-R2-06-16.docx (2.3 MB)

@e503824 Captions in your document are list items with a special number format. You can get the caption paragraphs from your document using code like the following:

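// Requires: import com.aspose.words.*;
// Load the source document.
Document doc = new Document("C:\\Temp\\in.docx");

// Collect every paragraph in the document, including nested ones (isDeep = true).
Iterable<Paragraph> paragraphs = doc.getChildNodes(NodeType.PARAGRAPH, true);
for (Paragraph p : paragraphs)
{
    // Keep only list items whose list level number format begins with "Fig".
    if (p.isListItem() && p.getListFormat().getListLevel().getNumberFormat().startsWith("Fig"))
    {
        // Print the paragraph as plain text with surrounding whitespace removed.
        System.out.println(p.toString(SaveFormat.TEXT).trim());
    }
}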
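
Note that the `startsWith("Fig")` check is case-sensitive, so a list whose number format is written as "FIGURE" or "fig." would be skipped. If your documents vary in this respect, a more tolerant prefix check may help. Below is a minimal sketch in plain Java, with no Aspose.Words dependency. The class `CaptionFormatMatcher`, the method `isFigureFormat`, and the sample strings are hypothetical names made up for illustration, not part of the Aspose.Words API.

```java
import java.util.Locale;

public class CaptionFormatMatcher {

    // Hypothetical helper (not an Aspose.Words API): returns true when a
    // number format string begins with "fig" in any letter case, ignoring
    // leading whitespace.
    static boolean isFigureFormat(String numberFormat) {
        if (numberFormat == null) {
            return false;
        }
        // Drop leading whitespace, then lower-case for a case-insensitive prefix test.
        String normalized = numberFormat.replaceFirst("^\\s+", "").toLowerCase(Locale.ROOT);
        return normalized.startsWith("fig");
    }

    public static void main(String[] args) {
        // Made-up sample formats, for illustration only.
        String[] samples = { "Figure ", "FIG. ", "  fig ", "Table " };
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isFigureFormat(s));
        }
    }
}
```

In the loop above, the condition could then pass the value returned by `getNumberFormat()` to `isFigureFormat(...)` instead of calling `startsWith("Fig")` directly.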