Dear team,
We are extracting images from DOCX files, but in the case below we are not able to read the figure captions. I found that the figure captions are stored as list items. Please refer to the input below and provide source code to read the figure captions.
Input : manuscript-clean-R2-06-16.docx (2.3 MB)
@e503824 Captions in your document are list items with a special number format. You can get the caption paragraphs from your document using code like the following:
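// Requires: import com.aspose.words.*;
Document doc = new Document("C:\\Temp\\in.docx");
// Collect every paragraph in the document, including those nested in tables and shapes.
Iterable<Paragraph> paragraphs = doc.getChildNodes(NodeType.PARAGRAPH, true);
for (Paragraph p : paragraphs)
{
    // Caption paragraphs are list items whose number format starts with "Fig".
    if (p.isListItem() && p.getListFormat().getListLevel().getNumberFormat().startsWith("Fig"))
    {
        // Print the paragraph as plain text.
        System.out.println(p.toString(SaveFormat.TEXT).trim());
    }
}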
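As a rough illustration of what the number-format check above is matching, here is a small plain-Java sketch that does not use Aspose.Words at all. It assumes the list level's number format is a string such as "Figure " followed by a placeholder character with code 0 to 8 that stands for the number of the corresponding list level, and that the numbers are rendered as plain Arabic digits. The class `FigureLabelSketch` and the helpers `isFigureFormat` and `renderLabel` are hypothetical names introduced only for this example. In a real document it is probably better to rely on the labels the library computes itself.

```java
public class FigureLabelSketch {
    // Hypothetical helper: true when the number format begins with "Fig", ignoring case.
    static boolean isFigureFormat(String numberFormat) {
        return numberFormat != null
                && numberFormat.regionMatches(true, 0, "Fig", 0, 3);
    }

    // Hypothetical helper: expands placeholder characters (codes 0 to 8, an assumption
    // about the format string) into the numbers of the matching list levels.
    static String renderLabel(String numberFormat, int[] levelNumbers) {
        StringBuilder sb = new StringBuilder();
        for (char c : numberFormat.toCharArray()) {
            if (c > 8) {
                sb.append(c);                    // ordinary text is copied as-is
            } else if (c < levelNumbers.length) {
                sb.append(levelNumbers[c]);      // placeholder for list level c
            }
            // placeholders with no known level number are skipped
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample format built by hand: "Figure ", a level-0 placeholder, then ".".
        String format = "Figure " + (char) 0 + ".";
        if (isFigureFormat(format)) {
            System.out.println(renderLabel(format, new int[] {3}));
        }
    }
}
```

The case-insensitive prefix test also accepts formats such as "FIGURE" or "fig.", which the `startsWith("Fig")` check above would miss.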