Issue extracting images with anchors

Hi Team,

I need to extract the images that are followed by a figure caption.

I extract each image and save it into a new document, but some images are extracted together with text. Please help me either delete the text that comes with the extracted images or extract the images without any text.

source document: TestForText (2).zip (35.4 KB)

actual output: sample.zip (33.1 KB)

expected output: ExpectedOutput.zip (33.8 KB)

Thanks & regards,
Priyanga G

@priyanga,

Thanks for your inquiry. In your case, we suggest removing the Run nodes, as shown below, before saving the document. Hope this helps.

dstDoc.getChildNodes(NodeType.RUN, true).clear();
dstDoc.save(MyDir + "out.docx");

Hi Team,

Thank you very much.

It’s working fine.

However, some other documents have an issue when extracting the label images and part images.

Extracting the images that have captions works. Please help me extract the rest of the images from the document, i.e. those other than the captioned ones such as Figure 1 (the label images and part images, which are unnumbered), and save each one into a new document. I need to extract all images.

source document: Test.zip (1.5 MB)

actual output: actual output.zip (1.3 MB)

Thanks & regards,
priyanga G

@priyanga,

Thanks for your inquiry. Please open your input Word document (TestForText (2).docx) in MS Word and set the text wrapping of the image to “In Line with Text”. You will notice that the figure caption is not under the image. In this case, we suggest using the following code example to extract each Shape node. Hope this helps.

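Document doc = new Document(MyDir + "TestForText (2).docx");

// Counter used to give each output document a unique file name.
int i = 1;

// Collect every Shape node in the document, including nested ones.
NodeCollection shapes = doc.getChildNodes(NodeType.SHAPE, true);
for (Shape shape : (Iterable<Shape>) shapes)
{
    // Start from a blank document for each shape.
    Document dstDoc = new Document();

    // Import the shape into the new document, keeping its source formatting.
    NodeImporter importer = new NodeImporter(doc, dstDoc, ImportFormatMode.KEEP_SOURCE_FORMATTING);
    Node newNode = importer.importNode(shape, true);
    dstDoc.getFirstSection().getBody().getFirstParagraph().appendChild(newNode);
    dstDoc.save(MyDir + "output" + i + ".docx");
    i++;
}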
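
For the follow-up request to get every image out of the document regardless of caption, a quick cross-check that does not rely on Aspose.Words at all is to read the .docx as a ZIP package. This is a different technique from the Shape import above: it is a minimal sketch that assumes the file is a .docx (not a legacy .doc) and that the images sit under the `word/media/` folder, which is where MS Word conventionally stores them. The class name `DocxMediaExtractor` and the method `extractMedia` are hypothetical names chosen for illustration. It only copies the raw image files out; it does not wrap each one in a new document.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of Aspose.Words: uses only the JDK.
final class DocxMediaExtractor {

    // Assumption: embedded images live under this folder inside the package.
    private static final String MEDIA_PREFIX = "word/media/";

    /** Copies every file under word/media/ into outDir and returns the written paths. */
    static List<Path> extractMedia(Path docx, Path outDir) throws IOException {
        List<Path> written = new ArrayList<>();
        Files.createDirectories(outDir);

        try (ZipFile zip = new ZipFile(docx.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                String name = entry.getName();

                // Skip folders and everything that is not a media part.
                if (entry.isDirectory() || !name.startsWith(MEDIA_PREFIX)) {
                    continue;
                }

                // Use only the last path segment so nothing is written outside outDir.
                String fileName = name.substring(name.lastIndexOf('/') + 1);
                if (fileName.isEmpty()) {
                    continue;
                }

                Path target = outDir.resolve(fileName);
                try (InputStream in = zip.getInputStream(entry)) {
                    Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
                }
                written.add(target);
            }
        }
        return written;
    }
}
```

Calling `DocxMediaExtractor.extractMedia(Path.of("Test.docx"), Path.of("images"))` would write each media file into the `images` folder. Comparing the number of files written with the number of output documents produced by the Shape loop is one way to spot images the loop missed, although the two counts can legitimately differ, for example when the same image is reused by several shapes.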