Free Support Forum - aspose.com

How to get images from word\media folder of DOCX using Java

The following docx file contains an image. You can verify this by extracting the docx as a zip and finding the file image1.jpeg inside the media directory.
However, when I try to fetch the image with
NodeCollection shapes = doc.getChildNodes(NodeType.SHAPE, true);
none of the shapes contains the image.
Could you please advise if there’s any other way to fetch the image?

image.zip (41.4 KB)

@Shiran1

In your document, there are two Shape nodes in the header and footer. These two shapes do not contain the image. You can open the document in MS Word and check these shapes in the header and footer.

Thanks for the response @tahir.manzoor. But this still doesn’t answer my question.
As I said:
None of the shapes contains the image.
Could you please advise if there’s any other way to fetch the image?

@Shiran1

You can use the Shape.getShapeRenderer() method, which creates and returns a ShapeRenderer object that can render the shape into an image. Once you have this object, you can call the ShapeRenderer.save method to render the shape and save it as an image.

The following code example shows how to export a shape to an image. Hope this helps you.

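import com.aspose.words.*;

// Load the document (MyDir is the folder that contains image.docx).
Document doc = new Document(MyDir + "image.docx");
// Take the first Shape node found anywhere in the document.
Shape shape = (Shape)doc.getChild(NodeType.SHAPE, 0, true);

// Render that shape to a PNG file named after the shape.
ShapeRenderer renderer = shape.getShapeRenderer();
ImageSaveOptions options = new ImageSaveOptions(SaveFormat.PNG);
renderer.save(MyDir + "Shape.ShapeRenderer " + shape.getName() + ".png", options);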

This doesn’t help since this is not what I asked.
If you open the docx file as if it were a zip, you can see there is a jpeg file inside called image1.jpeg.
My question is whether I can fetch this image with Aspose.
And if not, why?

@Shiran1

When an image is inserted into a document, it is stored in the media folder. The information about it is stored in document.xml and the header and footer .xml files. With Aspose.Words, you can read this information. In your case, image1.jpeg is not referenced in any of these .xml files, so it cannot be read by Aspose.Words. Moreover, please read the following article.
How Word Files Store Images

Hope this answers your query. Please let us know if you have any more queries.

In the link you sent, the Relationships section says:
“As well as the document structure in Document.xml, there are also the links or relationships between the document and other files, such as themes, fonts or images. These links are stored in the “_rels” directory.”

And indeed, there is a reference to this image in the theme1.xml.rels file:
<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="…/media/image1.jpeg"/>

So, as we can see, the XML structure of the document does refer to this image.
Why doesn’t Aspose.Words parse those files?
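
For anyone who wants to check this kind of relationship without unzipping by hand, here is a minimal sketch that uses only the JDK (not Aspose.Words). It scans every .rels part in the package and lists the image relationships it finds. The class and method names are hypothetical, and it assumes Java 9 or later for readAllBytes.

```java
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of any library.
final class ImageRelationshipLister {
    static final String IMAGE_REL_TYPE =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/image";

    /** Returns "relsPart -> Target" for every image relationship in the package. */
    static List<String> imageTargets(InputStream docx) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // .rels parts never need a DTD, so refuse them outright.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);

        List<String> result = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.getName().endsWith(".rels")) {
                    continue;
                }
                // Buffer the entry so the XML parser cannot close the zip stream.
                byte[] xml = zip.readAllBytes();
                NodeList rels = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml))
                    .getElementsByTagName("Relationship");
                for (int i = 0; i < rels.getLength(); i++) {
                    Element rel = (Element) rels.item(i);
                    if (IMAGE_REL_TYPE.equals(rel.getAttribute("Type"))) {
                        result.add(entry.getName() + " -> " + rel.getAttribute("Target"));
                    }
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Usage: java ImageRelationshipLister image.docx
        try (InputStream in = new FileInputStream(args[0])) {
            imageTargets(in).forEach(System.out::println);
        }
    }
}
```

Each printed line names the .rels part that owns the relationship, which shows whether an image is referenced from the document body, a header or footer, or (as here) the theme.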

@Shiran1

Unfortunately, Aspose.Words does not provide an API to extract images from the word/media folder that are not referenced in the header/footer or body of the document. However, we have logged this feature request as WORDSNET-19846 in our issue tracking system. You will be notified via this forum thread once this feature is available.

We apologize for the inconvenience.

@Shiran1

The requested image is a theme fill blip, and Aspose.Words does not have a public API to access it. This feature has been postponed, and we will inform you via this forum thread once there is an update available on it. We apologize for the inconvenience.
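
In the meantime, one possible workaround is to bypass Aspose.Words for this case and read the DOCX package directly as a zip archive. The sketch below uses only the JDK and copies every entry under word/media/ into an output folder. The class and method names are hypothetical, and it assumes the file is a regular, unencrypted DOCX.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of Aspose.Words.
final class DocxMediaExtractor {
    static final String MEDIA_PREFIX = "word/media/";

    /** Copies every file under word/media/ into outputDir and returns the written paths. */
    static List<Path> extractMedia(InputStream docx, Path outputDir) throws IOException {
        Files.createDirectories(outputDir);
        List<Path> written = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                if (entry.isDirectory() || !name.startsWith(MEDIA_PREFIX)) {
                    continue;
                }
                // Use only the last path segment so nothing is written outside outputDir.
                String fileName = name.substring(name.lastIndexOf('/') + 1);
                if (fileName.isEmpty()) {
                    continue;
                }
                Path target = outputDir.resolve(fileName);
                // Files.copy reads up to the end of the current zip entry only.
                Files.copy(zip, target, StandardCopyOption.REPLACE_EXISTING);
                written.add(target);
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java DocxMediaExtractor image.docx outputFolder
        try (InputStream in = new FileInputStream(args[0])) {
            extractMedia(in, Paths.get(args[1])).forEach(System.out::println);
        }
    }
}
```

This returns every media file in the package, including ones referenced only from the theme, so it does not tell you where in the document an image is used.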