How to use the Aspose API to save all Word images the same as done by saveas(..html..)

I am evaluating Aspose for Java. I need to use the API (getImageData()?) to save all images in the Word doc: the same images, in the same order, as those created by saveas(..html..).

When I extract images from my test Word document (by walking the node tree and calling getImageData()), I get a different number of images than the number created when converting the document to HTML (saveas(..html..)).

I use NodeCollection nc = doc.getChildNodes(NodeType.ANY, true)

Then for each node in nc I check for either NodeType.SHAPE or NodeType.DrawingML (I get 26 nodes that match). As a result, I get 16 Shapes (4 TextBoxes and 12 non-TextBoxes) and 10 DrawingML nodes.

For those Shapes, hasImage() is false for every one, which I thought was odd, even for TextBoxes. For the 10 DrawingML nodes, hasImage() is true for every one (as I expected).

For any node where hasImage() is true, I call getImageData() and save the image to a folder. In total I get 10 images.
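
A minimal sketch of the loop described above, written so that it compiles without the library. `ImageNode`, `OrderedImageNames`, and the `image.NNN.ext` naming pattern are hypothetical stand-ins, not Aspose types or Aspose's naming; only the `hasImage()` check comes from the post. It shows one way to number the saved files in traversal order so that the sequence can later be compared with another export.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical stand-in for an image-bearing node; not an Aspose type. */
interface ImageNode {
    boolean hasImage();   // mirrors the hasImage() check from the post
    String extension();   // assumed to be known per node, e.g. "png"
}

final class OrderedImageNames {

    /** Convenience factory for the sketch. */
    static ImageNode node(final boolean hasImage, final String extension) {
        return new ImageNode() {
            @Override public boolean hasImage() { return hasImage; }
            @Override public String extension() { return extension; }
        };
    }

    /**
     * Returns one file name per node that has an image, numbered in
     * traversal order. Nodes without an image are skipped and do not
     * consume a number.
     */
    static List<String> namesFor(List<ImageNode> nodesInDocumentOrder) {
        List<String> names = new ArrayList<>();
        int index = 0;
        for (ImageNode node : nodesInDocumentOrder) {
            if (!node.hasImage()) {
                continue;
            }
            index++;
            names.add(String.format(Locale.ROOT, "image.%03d.%s", index, node.extension()));
        }
        return names;
    }
}
```

Because nodes where hasImage() is false are skipped here, this numbering can only line up with the HTML export if that export also writes nothing for such nodes, which the counts in this post suggest is not the case.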

I then convert the same Word document to HTML using saveas(..html..).

Surprisingly, I get 20 image files (twice what I expected) in the html images folder.

So, I get 10 images with Aspose getImageData(), but 20 are created by the HTML conversion.

Even counting the 4 TextBoxes as placeholders, I am still 6 images short.
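
The arithmetic behind that shortfall can be checked with a small tally. This is only a sketch that restates the counts reported above; the class and method names are hypothetical, and treating each TextBox as accounting for exactly one HTML image is the assumption made in the post, not something confirmed by the library.

```java
final class ImageCountCheck {
    // Counts as reported in the post.
    static final int TEXT_BOX_SHAPES = 4;
    static final int OTHER_SHAPES = 12;
    static final int DRAWING_ML_NODES = 10; // hasImage() is true for all of these
    static final int HTML_IMAGE_FILES = 20;

    /** Nodes matching SHAPE or DrawingML. */
    static int matchingNodes() {
        return TEXT_BOX_SHAPES + OTHER_SHAPES + DRAWING_ML_NODES;
    }

    /** Images saved through getImageData(): only the DrawingML nodes. */
    static int extractedImages() {
        return DRAWING_ML_NODES;
    }

    /** HTML images still unaccounted for, assuming one image per TextBox. */
    static int unexplainedHtmlImages() {
        return HTML_IMAGE_FILES - extractedImages() - TEXT_BOX_SHAPES;
    }

    public static void main(String[] args) {
        System.out.println("matching nodes:    " + matchingNodes());          // 26
        System.out.println("extracted images:  " + extractedImages());        // 10
        System.out.println("unexplained files: " + unexplainedHtmlImages());  // 6
    }
}
```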

BTW, my test doc has mixed images; one contains a diagram with a hand-drawn oval overlay.

One of the extra images produced by the HTML conversion (a Shape, but one where hasImage() is false?) is just that hand-drawn oval.

How do I use the Aspose API to "always" pull the same images that are created during HTML saveas()?

Thanks for any help!

Hi Bernie,

Thank you for your inquiry. Could you please attach the input Word document you want to extract images from, so that I can test with it? I will investigate the issue on my side and provide you with a code snippet.

Best Regards,