Hi Team,
I am able to extract the labeled images by checking the next sibling paragraph for labels such as (a), (b). However, I want to get the figure caption for those labeled images. For example, figure caption 1 has three images, (a), (b) and (c), and each image is extracted separately. Please let me know how to extract the whole of figure caption 1.
I have attached the code:
// Requires: import com.aspose.words.*; import java.util.regex.Matcher; import java.util.regex.Pattern;
// interimdoc, i, strI, name, filename, filefoldername and page are assumed to be declared earlier (not shown).
try {
    DocumentBuilder builder = new DocumentBuilder(interimdoc);
    i = 1;
    NodeCollection shapes = interimdoc.getChildNodes(NodeType.SHAPE, true);
    for (Shape shape : (Iterable<Shape>) shapes)
    {
        if (shape.hasChart() || shape.hasImage())
        {
            Paragraph paragraph = shape.getParentParagraph();
            // The node that directly follows the paragraph holding the image
            Node node = paragraph.getNextSibling();
            // Modify this condition according to your requirement
            if (node != null && node.getNodeType() == NodeType.PARAGRAPH
                && (
                    ((Paragraph) node).isListItem() || node.toString(SaveFormat.TEXT).contains("Figure")
                    || node.toString(SaveFormat.TEXT).contains("(a)")
                    || node.toString(SaveFormat.TEXT).contains("(b)")
                    || node.toString(SaveFormat.TEXT).contains("(c)")
                ))
            {
                // Copy the shape into a new, empty document
                Document dstDoc = new Document();
                NodeImporter importer = new NodeImporter(interimdoc, dstDoc, ImportFormatMode.KEEP_SOURCE_FORMATTING);
                Node newNode = importer.importNode(shape, true);
                dstDoc.getFirstSection().getBody().getFirstParagraph().appendChild(newNode);
                if (dstDoc.getFirstSection().getBody().getFirstParagraph().toString(SaveFormat.TEXT).trim()
                        .length() == 0)
                {
                    dstDoc.getFirstSection().getBody().getFirstParagraph().remove();
                }
                /** OUTPUT FILENAME START **/
                // Text of the paragraph that holds the image; used to build the file name
                String Imgcaption = paragraph.toString(SaveFormat.TEXT);
                // Locate the first run of digits in the text
                int k = 0;
                while (k < Imgcaption.length() && !Character.isDigit(Imgcaption.charAt(k)))
                    k++;
                int j = k;
                while (j < Imgcaption.length() && Character.isDigit(Imgcaption.charAt(j)))
                    j++;
                // Throws NumberFormatException when the text contains no digits
                int l = Integer.parseInt(Imgcaption.substring(k, j));
                strI = Integer.toString(l);
                Pattern pattern = Pattern.compile(strI);
                Matcher matcher = pattern.matcher(Imgcaption);
                // Keep the text up to the end of the last occurrence of the number
                while (matcher.find())
                {
                    name = Imgcaption.substring(0, matcher.end());
                    name = name.replace(".", "_");
                }
                if (name.startsWith("Fig"))
                {
                    name = "Fig" + "_" + l;
                }
                /** OUTPUT FILENAME END **/
                filename = filefoldername + page + "_" + "Fig_" + i + "_" + "Fig_label" + name + ".docx";
                dstDoc.save(filename);
                i++;
            }
        }
    }
} catch (NumberFormatException e) {
    // Integer.parseInt failed, e.g. the text contained no digits
    e.printStackTrace();
}
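To make the requirement clearer, here is a minimal sketch of the lookup I have in mind, written against plain paragraph texts rather than Aspose.Words nodes. The class `CaptionLookup`, the method `findCaption` and both patterns are hypothetical names and assumptions of mine, not part of the library: I assume the full caption starts with "Fig" or "Figure" followed by a number, and that sub-figure labels look like "(a)", "(b)", "(c)".

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Aspose.Words. It works on plain paragraph
// texts so the lookup logic can be tried without a document.
class CaptionLookup {
    // Assumption: a full caption starts with "Fig" or "Figure" and a number.
    private static final Pattern CAPTION =
            Pattern.compile("^Fig(?:ure)?\\.?\\s*\\d+.*", Pattern.DOTALL);
    // Assumption: a sub-figure label paragraph starts with "(a)", "(b)", ...
    private static final Pattern SUB_LABEL =
            Pattern.compile("^\\([a-z]\\).*", Pattern.DOTALL);

    /**
     * Walks forward from the paragraph that holds an image and returns the
     * first full caption found. Empty paragraphs and sub-figure labels are
     * skipped; any other text ends the search and null is returned.
     */
    static String findCaption(List<String> paragraphs, int imageIndex) {
        for (int idx = imageIndex + 1; idx < paragraphs.size(); idx++) {
            String text = paragraphs.get(idx).trim();
            if (CAPTION.matcher(text).matches()) {
                return text;
            }
            if (text.isEmpty() || SUB_LABEL.matcher(text).matches()) {
                continue; // label or image-only paragraph: keep looking
            }
            return null; // ordinary body text reached before any caption
        }
        return null;
    }
}
```

With paragraphs such as "", "(a)", "", "(b)", "", "(c)", "Figure 1. Three panels", every image paragraph (the empty entries) would resolve to the same "Figure 1. Three panels" caption. In my code I assume the same walk could be done by calling `getNextSibling()` repeatedly from the shape's parent paragraph and testing `toString(SaveFormat.TEXT)` on each node, but I am not sure this is the right approach.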
I look forward to your quick reply.
Many thanks in advance.
Thanks & Regards,
priyanga G