Extracting images and figures from doc or docx file

Is it possible to extract images and figures from a doc or docx file using the Aspose Java API? If so, how?

Hi,

Thanks for your inquiry. Yes, with the Aspose.Words for Java component you can extract all images and figures found in the document. For more information, please visit the following link:
https://reference.aspose.com/words/java/com.aspose.words/shape/

I hope this helps.

Best Regards,

I’m using Aspose.Words.jdk16, and I’m getting a compilation error in

FileFormatUtil.imageTypeToExtension(shape.getImageData().getImageType())

I changed it to

FileFormatUtil.extensionToSaveFormat(shape.getImageData().getImageType()+"")

But it is still neither extracting anything nor giving any error. I am attaching the files.

My bad… it’s working.
But the images do not have the correct extensions.
From the documentation, I found the following types:

NO_IMAGE = 0
UNKNOWN = 1
EMF = 2
WMF = 3
PICT = 4
JPEG = 5 (JPEG JFIF)
PNG = 6
BMP = 7

Does that mean that if the actual image is of type gif, ico, or psd, it will not be saved in the correct format?
Also, is it possible to get the original names of the extracted images?

Hi,

Thanks for the additional information. Firstly, when you load your DOCX into an Aspose.Words object, all pictures are identified as DrawingML objects. Please see the code that I modified for you:

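// NodeCollection, NodeType and DrawingML come from the com.aspose.words package;
// doc is assumed to be an already loaded Document.
// Collect every DrawingML node in the document, including nested ones.
NodeCollection images = doc.getChildNodes(NodeType.DRAWING_ML, true, false);
int imageIndex = 0;
for (DrawingML image : (Iterable<DrawingML>) images)
{
    // Skip drawings that carry no picture data.
    if (image.hasImage())
    {
        // Note: every file is named with a .jpeg extension here, whatever the image type.
        String imageFileName = java.text.MessageFormat.format("Image.ExportImages.{0}.jpeg", imageIndex);
        image.getImageData().save("C:\\test\\" + imageFileName);
        imageIndex++;
    }
}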
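
The loop above names every file with a .jpeg extension regardless of what the picture really is. As a rough sketch only, and not part of the Aspose.Words API, one way to choose an extension is to look at the leading signature bytes of the image. The class ImageSniffer, its method names, and the ".bin" fallback are hypothetical, and the sketch assumes you already have the raw image bytes in a byte[]; how you obtain them from the library is not shown here.

```java
// Hypothetical helper: guesses a file extension from well-known file signatures.
// It only recognises PNG, JPEG, GIF and BMP; anything else falls back to ".bin".
final class ImageSniffer {

    static String extensionFromBytes(byte[] data) {
        if (startsWith(data, 0x89, 0x50, 0x4E, 0x47)) return ".png";  // 0x89 'P' 'N' 'G'
        if (startsWith(data, 0xFF, 0xD8, 0xFF)) return ".jpeg";       // JPEG start-of-image marker
        if (startsWith(data, 0x47, 0x49, 0x46, 0x38)) return ".gif";  // "GIF8" (GIF87a / GIF89a)
        if (startsWith(data, 0x42, 0x4D)) return ".bmp";              // "BM"
        return ".bin";                                                // assumed fallback, not an Aspose convention
    }

    // True when data begins with the given unsigned byte values.
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

The returned extension could then replace the fixed ".jpeg" when the file name is built.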

Secondly, in Java the getImageType method returns an int representing the type of the image. For more details, please see the following link:
https://reference.aspose.com/words/java/com.aspose.words/imagedata/
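
Because getImageType returns an int, a small lookup can turn that value into a file extension. The following is a minimal sketch, assuming the numeric values quoted earlier in this thread (EMF = 2 through BMP = 7) apply to your version of the library. The class ImageTypeExtensions, its methods, and the ".bin" fallback are hypothetical names used for illustration and are not part of Aspose.Words.

```java
// Hypothetical helper: maps the int image type values quoted in this thread
// to file extensions. Check the values against your library version.
final class ImageTypeExtensions {

    static String extensionFor(int imageType) {
        switch (imageType) {
            case 2: return ".emf";
            case 3: return ".wmf";
            case 4: return ".pict";
            case 5: return ".jpeg";
            case 6: return ".png";
            case 7: return ".bmp";
            default: return ".bin"; // NO_IMAGE (0), UNKNOWN (1) or any unlisted value; assumed fallback
        }
    }

    // Builds a name in the same pattern as the sample code, with a type-based extension.
    static String fileNameFor(int imageIndex, int imageType) {
        return "Image.ExportImages." + imageIndex + extensionFor(imageType);
    }
}
```

In the export loop this could be called as ImageTypeExtensions.fileNameFor(imageIndex, image.getImageData().getImageType()) in place of the fixed ".jpeg" name.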

Please let me know if I can be of any further assistance.

Best Regards,

NodeType.DRAWING_ML is giving me a compilation error. The DRAWING_ML field is not available in the NodeType class.

Hi,

Thanks for your request. To overcome this problem, I suggest downloading and using the latest version of Aspose.Words, i.e. 10.6.0, from the following link:

https://releases.aspose.com/words/java

I hope this helps.

Best Regards,