Extract images using the figure caption above them

Dear Team,

We need to extract images from the source document, using the figure caption that appears above each image to locate it.

I’ve attached my source code.
Code : code.zip (990 Bytes)

Input : tbj_13137.zip (1.1 MB)

Thank you.

@ssvel

Thanks for your inquiry. Please ZIP and attach your expected output documents here for our reference. We will then provide you with more information about your query, along with code.

@tahir.manzoor

Thanks for your quick reply.

Expected output : tbj_13137.zip (1.1 MB)

Thank you.

@ssvel

Thanks for sharing the document. However, you have shared the input document again. The following code example shows how to get the images that follow a “Fig” caption. Hope this helps you.

import com.aspose.words.*;

// MyDir is the folder that holds the input document (defined elsewhere).
Document doc = new Document(MyDir + "tbj_13137.docx");
int i = 1;
NodeCollection paragraphs = doc.getChildNodes(NodeType.PARAGRAPH, true);
for (Paragraph paragraph : (Iterable<Paragraph>) paragraphs)
{
    // A caption is any paragraph whose text starts with "Fig".
    Node next = paragraph.getNextSibling();
    if (next != null && paragraph.toString(SaveFormat.TEXT).trim().startsWith("Fig"))
    {
        // The image is expected in the node that directly follows the caption.
        if (next.isComposite() && ((CompositeNode) next).getChildNodes(NodeType.SHAPE, true).getCount() > 0)
        {
            // Copy that node into a new document and save it as output<i>.docx.
            Document dstDoc = new Document();
            NodeImporter importer = new NodeImporter(doc, dstDoc, ImportFormatMode.KEEP_SOURCE_FORMATTING);
            Node importNode = importer.importNode(next, true);
            dstDoc.getFirstSection().getBody().appendChild(importNode);
            dstDoc.save(MyDir + "output" + i + ".docx");
            i++;
        }
    }
}
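
Note that startsWith("Fig") also matches ordinary body text such as "Figures 1 and 2 show ...". If that causes false matches in your document, a stricter caption test may help. The following is only a sketch in plain Java (no Aspose calls): the class name FigureCaptionMatcher and its methods are hypothetical, and it assumes that your captions are numbered, e.g. "Fig. 1" or "Figure 2". Adjust the pattern if your captions look different.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Aspose.Words.
final class FigureCaptionMatcher {
    // Assumed caption shape: "Fig", "Fig." or "Figure" (any case),
    // optional whitespace, then the figure number.
    private static final Pattern CAPTION =
            Pattern.compile("^\\s*(?:fig\\.?|figure)\\s*(\\d{1,9})", Pattern.CASE_INSENSITIVE);

    private FigureCaptionMatcher() {}

    // Returns the figure number if the text looks like a caption, otherwise empty.
    static OptionalInt figureNumber(String paragraphText) {
        if (paragraphText == null) {
            return OptionalInt.empty();
        }
        Matcher m = CAPTION.matcher(paragraphText);
        return m.find() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
    }

    // True only for numbered captions; "Figuring out ..." or "Figures 1 and 2 ..." do not match.
    static boolean isFigureCaption(String paragraphText) {
        return figureNumber(paragraphText).isPresent();
    }
}
```

In the loop above you could then replace the startsWith("Fig") check with FigureCaptionMatcher.isFigureCaption(paragraph.toString(SaveFormat.TEXT)), and use the returned figure number in the output file name instead of the counter i.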