Extract shapes/images using paragraph nodes in Java

Hi team,
I need a solution that uses paragraph nodes to extract images and saves each image to a separate document, filtered so that only images whose caption starts with “Fig” are extracted.

Thank you.

Attachments: extract.zip (1.6 MB), extractprob.zip (1.5 MB)

Hi Priya,

Thanks for your inquiry. Please use the following code example to export the shapes into new documents. Hope this helps.

import com.aspose.words.*;

// MyDir is the folder that holds the input document and receives the output files.
Document doc = new Document(MyDir + "extractproblem.docx");
int i = 1;
NodeCollection paragraphs = doc.getChildNodes(NodeType.PARAGRAPH, true);
for (Paragraph paragraph : (Iterable<Paragraph>) paragraphs)
{
    // A caption paragraph starts with "Fig" (this also matches "Figure") and
    // directly follows a paragraph that contains at least one shape.
    Node previous = paragraph.getPreviousSibling();
    if (paragraph.toString(SaveFormat.TEXT).trim().startsWith("Fig")
            && previous != null
            && previous.getNodeType() == NodeType.PARAGRAPH
            && ((Paragraph) previous).getChildNodes(NodeType.SHAPE, true).getCount() > 0)
    {
        NodeCollection shapes = ((Paragraph) previous).getChildNodes(NodeType.SHAPE, true);
        for (Shape shape : (Iterable<Shape>) shapes)
        {
            // Create a fresh document per shape so each output file holds exactly one image.
            Document dstDoc = new Document();
            NodeImporter importer = new NodeImporter(doc, dstDoc, ImportFormatMode.KEEP_SOURCE_FORMATTING);
            Node newNode = importer.importNode(shape, true);
            dstDoc.getFirstSection().getBody().getFirstParagraph().appendChild(newNode);
            dstDoc.save(MyDir + "output" + i + ".docx");
            i++;
        }
    }
}
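
If the captions in your documents vary in case or leading whitespace (for example "fig. 2" or an indented "Figure 3"), the prefix test can be moved into a small helper. The following is only a minimal sketch in plain Java; CaptionFilter and isFigureCaption are hypothetical names used for illustration and are not part of Aspose.Words.

```java
final class CaptionFilter {
    private CaptionFilter() {}

    /**
     * Returns true when the paragraph text, after trimming surrounding
     * whitespace and control characters, begins with the given prefix.
     * The comparison ignores case. Null or empty input yields false.
     */
    static boolean isFigureCaption(String paragraphText, String prefix) {
        if (paragraphText == null || prefix == null || prefix.isEmpty()) {
            return false;
        }
        String text = paragraphText.trim();
        // regionMatches returns false when text is shorter than the prefix.
        return text.regionMatches(true, 0, prefix, 0, prefix.length());
    }

    public static void main(String[] args) {
        System.out.println(isFigureCaption("  Figure 1: Overview\r\n", "Fig"));
        System.out.println(isFigureCaption("Table 1: Results", "Fig"));
    }
}
```

With such a helper, the first condition of the if statement could read CaptionFilter.isFigureCaption(paragraph.toString(SaveFormat.TEXT), "Fig").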

Thank You

Regards
Priya Dharshini J P