Dear Team,
Can I get a workaround to extract part images together with their figure caption in Java? Or can I get an idea of how to extract part images?
Part images: images that have no legends (i.e. (a), (b), (c), …) below them, but that are placed one below or beside another, with a single figure caption below the whole group.
Attached sample :
part.zip (333.0 KB)
Thanks…
@jan.kathir
Please ZIP and attach your expected output Word documents here for our reference. We will then provide more information about your query along with a code example.
Hi
@tahir.manzoor
I have attached the sample output for the above input below. Please find the attachment.
Attached sample output: Sample output.zip (347.0 KB)
@jan.kathir
We are working on your query and will share the code example with you soon.
@jan.kathir
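Please use the following code example to extract the shapes from the document. For each paragraph that starts with "Fig", it collects the text-free paragraphs directly above it and saves each group to a separate document. Hope this helps you.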
// Requires Aspose.Words for Java. MyDir is the folder that holds part.docx (defined elsewhere).
// Run the statements inside a method declared with "throws Exception".
import com.aspose.words.*;
import java.util.ArrayList;
import java.util.Collections;

Document doc = new Document(MyDir + "part.docx");
doc.updateListLabels();
int i = 1;
ArrayList<Node> nodes = new ArrayList<Node>();
for (Paragraph paragraph : (Iterable<Paragraph>) doc.getChildNodes(NodeType.PARAGRAPH, true))
{
    // A caption paragraph is one whose text starts with "Fig".
    if (paragraph.toString(SaveFormat.TEXT).trim().startsWith("Fig"))
    {
        // Walk backwards from the caption over consecutive paragraphs that have no text;
        // the image-only paragraphs are expected to sit directly above the caption.
        Node previousPara = paragraph.getPreviousSibling();
        while (previousPara != null
                && previousPara.getNodeType() == NodeType.PARAGRAPH
                && previousPara.toString(SaveFormat.TEXT).trim().length() == 0)
        {
            nodes.add(previousPara);
            previousPara = previousPara.getPreviousSibling();
        }
        if (nodes.size() > 0)
        {
            // The nodes were collected bottom-up, so reverse them into document order.
            Collections.reverse(nodes);
            // Export the consecutive shapes into a new document.
            Document dstDoc = new Document();
            // One importer per destination document is enough.
            NodeImporter importer = new NodeImporter(doc, dstDoc, ImportFormatMode.KEEP_SOURCE_FORMATTING);
            for (Node para : nodes)
            {
                Node newNode = importer.importNode(para, true);
                dstDoc.getFirstSection().getBody().appendChild(newNode);
            }
            // Remove the empty paragraph that a new Document starts with.
            if (dstDoc.getFirstSection().getBody().getFirstParagraph().toString(SaveFormat.TEXT).trim().length() == 0)
                dstDoc.getFirstSection().getBody().getFirstParagraph().remove();
            dstDoc.save(MyDir + "output" + i + ".docx");
            i++;
            nodes.clear();
        }
    }
}
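
The backward walk used above can also be tried without Aspose.Words. The following is only a minimal sketch of the grouping logic, under these assumptions: plain strings stand in for paragraphs, an empty string stands in for a paragraph that holds only an image, and the names CaptionGrouping and groupPartImages are hypothetical and not part of any library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CaptionGrouping {

    // Returns, for every caption (text starting with "Fig"), the indexes of the
    // consecutive empty-text paragraphs directly above it, in document order.
    // Assumption: an empty string models an image-only paragraph.
    static List<List<Integer>> groupPartImages(List<String> paragraphs) {
        List<List<Integer>> groups = new ArrayList<>();
        for (int i = 0; i < paragraphs.size(); i++) {
            if (!paragraphs.get(i).trim().startsWith("Fig")) {
                continue; // not a caption
            }
            List<Integer> group = new ArrayList<>();
            // Walk backwards until a paragraph with text (or the start) is reached.
            for (int j = i - 1; j >= 0 && paragraphs.get(j).trim().isEmpty(); j--) {
                group.add(j);
            }
            if (!group.isEmpty()) {
                Collections.reverse(group); // collected bottom-up, restore order
                groups.add(group);
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> paragraphs = Arrays.asList(
                "Intro text", "", "", "Fig. 1 Two stacked panels",
                "More text", "", "Fig. 2 Single panel");
        System.out.println(groupPartImages(paragraphs)); // prints [[1, 2], [5]]
    }
}
```

Each inner list corresponds to one output document in the Aspose code. Note that the "Fig" prefix test will also match ordinary body paragraphs that happen to start with "Fig", so a stricter caption check may be needed for other documents.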