Hi Team,
I need to extract the display images from a document and save them into a new document. Could you please help me with this?
source document: 34.zip (397.3 KB)
expected output: expectedoutput.zip (65.5 KB)
Thanks & regards,
priyanga G
@priyanga,
Thanks for your inquiry. Please use the following code example to extract each image from the document and insert it into a new document. Each image is saved to its own output file.
// Assumes "import com.aspose.words.*;" and that MyDir holds the working folder path.
Document doc = new Document(MyDir + "34.doc");
int i = 1;
for (Shape shape : (Iterable<Shape>) doc.getChildNodes(NodeType.SHAPE, true))
{
    // Skip shapes that do not contain an image.
    if (!shape.hasImage())
        continue;
    // Create a new blank document for this image.
    Document dstDoc = new Document();
    // Import the shape so that it belongs to the destination document.
    NodeImporter importer = new NodeImporter(doc, dstDoc, ImportFormatMode.KEEP_SOURCE_FORMATTING);
    Node newNode = importer.importNode(shape, true);
    dstDoc.getFirstSection().getBody().getFirstParagraph().appendChild(newNode);
    dstDoc.save(MyDir + "output" + i + ".docx");
    i++;
}
Hi @tahir.manzoor,
Thank you very much.
It’s working fine.
Another issue: I want to extract only the display images, but the previous code also extracts images that have a figure caption (for example, Figure 1). Please let me know how to ignore the images that have a figure caption.
input: 34.zip (397.3 KB)
actual output: actual output.zip (344.0 KB)
expected output: expectedoutput.zip (293.7 KB)
Thanks & regards,
priyanga G
@priyanga,
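Thanks for your inquiry. You can skip the images that have a "Fig" caption using the following code example. It ignores a shape when the text of its parent paragraph, or of the paragraph immediately after it, starts with "Fig". Hope this helps you.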
// Assumes "import com.aspose.words.*;" and that MyDir holds the working folder path.
Document doc = new Document(MyDir + "34.doc");
int i = 1;
for (Shape shape : (Iterable<Shape>) doc.getChildNodes(NodeType.SHAPE, true))
{
    // Skip shapes that do not contain an image.
    if (!shape.hasImage())
        continue;
    // Guard against a null parent paragraph (assumed possible for shapes nested in a group shape).
    Paragraph para = shape.getParentParagraph();
    if (para != null)
    {
        Node next = para.getNextSibling();
        // Skip the image when its own paragraph or the next one starts with "Fig".
        if (para.toString(SaveFormat.TEXT).trim().startsWith("Fig")
            || (next != null && next.toString(SaveFormat.TEXT).trim().startsWith("Fig")))
            continue;
    }
    // Import the shape into a new blank document and save it.
    Document dstDoc = new Document();
    NodeImporter importer = new NodeImporter(doc, dstDoc, ImportFormatMode.KEEP_SOURCE_FORMATTING);
    Node newNode = importer.importNode(shape, true);
    dstDoc.getFirstSection().getBody().getFirstParagraph().appendChild(newNode);
    dstDoc.save(MyDir + "output" + i + ".docx");
    i++;
}
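Please note that startsWith("Fig") also matches ordinary text such as "Figures of merit are listed below". If that becomes a problem in your documents, the caption test could be made stricter. The following is only a sketch in plain Java (no Aspose.Words calls): the class name CaptionFilter and its methods are hypothetical helpers, and it assumes captions look like "Fig. 1" or "Figure 2", i.e. the word followed by a number.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Aspose.Words.
final class CaptionFilter {
    // Assumption: a caption starts with "Fig" or "Figure", an optional dot,
    // optional spaces, and then at least one digit (case-insensitive).
    private static final Pattern FIGURE_CAPTION =
            Pattern.compile("^\\s*fig(?:ure)?\\.?\\s*\\d+", Pattern.CASE_INSENSITIVE);

    // Returns true when the paragraph text looks like a numbered figure caption.
    static boolean isFigureCaption(String paragraphText) {
        return paragraphText != null && FIGURE_CAPTION.matcher(paragraphText).find();
    }

    // An image counts as a display image when neither its own paragraph
    // nor the following paragraph looks like a figure caption.
    static boolean isDisplayImage(String ownParagraphText, String nextParagraphText) {
        return !isFigureCaption(ownParagraphText) && !isFigureCaption(nextParagraphText);
    }

    public static void main(String[] args) {
        System.out.println(isDisplayImage("", "Fig. 1 Sample setup"));               // prints false
        System.out.println(isDisplayImage("", "Figures of merit are listed below.")); // prints true
    }
}
```

In the loop above, you would pass para.toString(SaveFormat.TEXT) and the text of the next sibling (or null when there is none) to isDisplayImage. Captions without a number, such as "Fig. A", would not be detected by this pattern, so adjust it to the caption style used in your documents.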
Hi @tahir.manzoor,
Thank you very much.
I am able to get the exact output now.
Thanks & regards,
Priyanga G