How to get figure caption

Hi Team,

How can I get the figure caption of a particular image? The figure caption (the text after the image) is inside a Paragraph node, and I use the following code to get it:

String Imgcaption = paragraph.toString(SaveFormat.TEXT);

But in some cases there is a label part such as a), b)…, and the figure caption comes only after that label part.

Please help me get the figure caption in these label cases.

Source: sample.zip (170.4 KB)

Thanks & regards,
Priyanga G

@priyanga,

Thanks for your inquiry. Please use following code example to get the desired output.

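import com.aspose.words.*;

Document doc = new Document(MyDir + "sample.docx");
// Update the list labels for all list items in the document.
doc.updateListLabels();
// Take the paragraph at index 2 (the index is zero-based).
Paragraph paragraph = (Paragraph) doc.getChild(NodeType.PARAGRAPH, 2, true);
if (paragraph.isListItem())
    System.out.println(paragraph.toString(SaveFormat.TEXT));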
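If the exported paragraph text begins with a label such as a) or (b), one possible way to keep only the caption is to strip that label with a regular expression. This is only a sketch in plain Java: the class name LabelStripper, the method stripLabel, and the assumption that a label is a single lowercase letter followed by a closing parenthesis are all hypothetical and are not taken from the sample document.

```java
final class LabelStripper {
    // Assumption: a label is one lowercase letter followed by ")", optionally preceded by "(".
    private static final String LABEL_PREFIX = "^\\s*\\(?[a-z]\\)\\s*";

    // Removes one leading label, if present, and trims the remaining text.
    static String stripLabel(String paragraphText) {
        return paragraphText.replaceFirst(LABEL_PREFIX, "").trim();
    }

    public static void main(String[] args) {
        // In the thread's setting the string would come from paragraph.toString(SaveFormat.TEXT).
        System.out.println(stripLabel("a) Top view of the device"));
    }
}
```

Text that does not start with such a label is returned unchanged apart from trimming.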

Hi @tahir.manzoor,

Thank you very much.

I am still facing the same issue with the following sample.

Please help me solve this issue.

Sample1: sample1.zip (186.4 KB)

Thanks & Regards,
Priyanga G

@priyanga,

Thanks for your inquiry. You can get the Fig caption using the Paragraph.toString(SaveFormat.TEXT) method. You can then check whether the text starts with “Fig” using the String.startsWith method.
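As a minimal sketch of the String.startsWith check in plain Java: the class name FigCaptionCheck and the method isFigCaption are hypothetical names used only for illustration, and the check assumes that captions begin with the exact, case-sensitive prefix "Fig".

```java
final class FigCaptionCheck {
    // Returns true when the text, ignoring surrounding whitespace, begins with "Fig".
    static boolean isFigCaption(String paragraphText) {
        return paragraphText != null && paragraphText.trim().startsWith("Fig");
    }

    public static void main(String[] args) {
        // In the thread's setting the string would come from paragraph.toString(SaveFormat.TEXT).
        System.out.println(isFigCaption("  Fig. 1 Sample image\r\n"));
        System.out.println(isFigCaption("(a) first panel"));
    }
}
```

The trim() call matters because exported paragraph text can carry leading whitespace or a trailing paragraph break.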

Hi @tahir.manzoor,

Thanks for your feedback.

I am not able to use the String.startsWith method.

Please help me get the figure caption for images that are followed by a legend.

Input: sample1.zip (186.4 KB)

Thanks & Regards,
Priyanga G

@priyanga,

Thanks for your inquiry. Please use the following code example to get the Fig caption followed by legend.

Document doc = new Document(MyDir + "in.docx");

NodeCollection paragraphs = doc.getChildNodes(NodeType.PARAGRAPH, true);
for (Paragraph paragraph : (Iterable<Paragraph>) paragraphs) {
    // Only paragraphs whose text starts with "Fig" are caption candidates.
    if (paragraph.toString(SaveFormat.TEXT).trim().startsWith("Fig")) {
        Node previousPara = paragraph.getPreviousSibling();

        // Walk back over empty paragraphs that contain neither text nor shapes.
        while (previousPara != null && previousPara.getNodeType() == NodeType.PARAGRAPH
                && previousPara.toString(SaveFormat.TEXT).trim().length() == 0
                && ((Paragraph) previousPara).getChildNodes(NodeType.SHAPE, true).getCount() == 0) {
            previousPara = previousPara.getPreviousSibling();
        }

        // The caption may have no preceding sibling, so guard against null.
        if (previousPara == null)
            continue;

        // Print the caption only when the preceding node carries a legend label.
        String previousText = previousPara.toString(SaveFormat.TEXT).trim();
        if (previousText.contains("(a)") || previousText.contains("(b)")
                || previousText.contains("(c)") || previousText.contains("(d)")) {
            System.out.println(paragraph.toString(SaveFormat.TEXT));
        }
    }
}
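The repeated contains checks in the reply above could also be folded into a single regular expression. The following is only a sketch in plain Java: the class name LegendLabel and the method hasLegendLabel are hypothetical, and the pattern assumes that a legend label is any single lowercase letter in parentheses, which is broader than the labels (a) to (d) named in the reply.

```java
import java.util.regex.Pattern;

final class LegendLabel {
    // Assumption: a legend label is one lowercase letter in parentheses, e.g. "(a)".
    private static final Pattern LABEL = Pattern.compile("\\([a-z]\\)");

    // Returns true when the text contains at least one legend label.
    static boolean hasLegendLabel(String text) {
        return text != null && LABEL.matcher(text).find();
    }

    public static void main(String[] args) {
        // In the thread's setting the string would be the text of the node before the caption.
        System.out.println(hasLegendLabel("(a) left panel (b) right panel"));
    }
}
```

To keep the original behavior exactly, the character class can be narrowed to [a-d].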

Hi @tahir.manzoor,

Thanks for your reply. It’s working fine.

Please help me extract the images that are followed by legends, and also get the figure caption for each extracted image.

Input: sample1.zip (186.4 KB)

Thanks & Regards,
Priyanga G

@priyanga,

Thanks for your inquiry. We already shared the code example in your other thread. Please check the code example shared here:
Extraction of image from word document

Regarding extracting the Fig caption, please check my previous post.