I am evaluating Aspose for Java. I need to use the API (getImageData()?) to save all of the images in a Word doc: the same images, in the same order, as those created by saveas(..html..).
When I extract images from my test Word document (by walking the node tree and calling getImageData()),
I get a different number of images than the number created when converting the document to HTML (saveas(..html..)).
I use NodeCollection nc = doc.getChildNodes(NodeType.ANY, true).
Then for each node in nc I check for either NodeType.SHAPE or NodeType.DrawingML (26 nodes match). That gives me 16 Shapes (4 TextBoxes and 12 non-TextBoxes) and 10 DrawingML nodes.
For the Shapes, hasImage() is false for every one (which I thought was odd, even for TextBoxes). For the 10 DrawingML nodes, hasImage() is true for every one (as I expected).
For every node where hasImage() is true, I call getImageData() and save the image to a folder. In total, I get 10 images.
I then convert the same Word document to HTML using saveas(..html..).
Surprisingly, I get 20 image files (twice what I expected) in the HTML images folder.
So, I get 10 images with Aspose getImageData(), but 20 are created by the HTML conversion.
Even counting the 4 TextBoxes as placeholders, I am still 6 images short.
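To make the counts concrete, here is the tally written out as a small Java sketch. The Node and Kind types are stand-ins I made up for this post (they are not Aspose classes), and the numbers are simply the ones I reported above.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in model only: Node and Kind are hypothetical, NOT Aspose types.
final class ImageTally {

    enum Kind { TEXT_BOX_SHAPE, OTHER_SHAPE, DRAWING_ML }

    static final class Node {
        final Kind kind;
        final boolean hasImage;

        Node(Kind kind, boolean hasImage) {
            this.kind = kind;
            this.hasImage = hasImage;
        }
    }

    // Rebuilds the node mix I observed in my test document.
    static List<Node> testDocumentNodes() {
        List<Node> nodes = new ArrayList<>();
        for (int i = 0; i < 4; i++) nodes.add(new Node(Kind.TEXT_BOX_SHAPE, false));
        for (int i = 0; i < 12; i++) nodes.add(new Node(Kind.OTHER_SHAPE, false));
        for (int i = 0; i < 10; i++) nodes.add(new Node(Kind.DRAWING_ML, true));
        return nodes;
    }

    // Number of nodes I can actually save an image from.
    static int countWithImage(List<Node> nodes) {
        int count = 0;
        for (Node n : nodes) {
            if (n.hasImage) count++;
        }
        return count;
    }

    static int countOfKind(List<Node> nodes, Kind kind) {
        int count = 0;
        for (Node n : nodes) {
            if (n.kind == kind) count++;
        }
        return count;
    }

    // HTML images not explained by extracted images or TextBox placeholders.
    static int unexplained(List<Node> nodes, int htmlImageCount) {
        return htmlImageCount
                - countWithImage(nodes)
                - countOfKind(nodes, Kind.TEXT_BOX_SHAPE);
    }

    public static void main(String[] args) {
        List<Node> nodes = testDocumentNodes();
        System.out.println("matching nodes: " + nodes.size());
        System.out.println("extractable images: " + countWithImage(nodes));
        System.out.println("unexplained HTML images: " + unexplained(nodes, 20));
    }
}
```

The 12 non-TextBox Shapes are where I would expect the 6 unexplained HTML images to come from, but I cannot tell which ones from the node tree alone.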
BTW, my test doc has mixed images: one contains a diagram with a hand-drawn oval overlay.
One of the extra images produced by the HTML conversion (a shape, but one where hasImage() is false?) is just that hand-drawn oval.
How do I use the Aspose API to "always" pull the same images that are created during HTML saveas()?
Thanks for any help!