I am trying to extract images from Word documents, using code based on the examples provided in the documentation for this task. The example code is as follows:
public void extractImagesToFiles() throws Exception
{
    Document doc = new Document(getMyDir() + "Image.SampleImages.doc");
    NodeCollection shapes = doc.getChildNodes(NodeType.SHAPE, true);
    int imageIndex = 0;
    for (Shape shape : (Iterable<Shape>) shapes)
    {
        if (shape.hasImage())
        {
            String imageFileName = java.text.MessageFormat.format(
                "Image.ExportImages.{0} Out{1}", imageIndex, FileFormatUtil.imageTypeToExtension(shape.getImageData().getImageType()));
            shape.getImageData().save(getMyDir() + imageFileName);
            imageIndex++;
        }
    }
}
The problem is that the above code does not find all the images in the file.
The attached file contains header/footer images, as well as the seal image,
and none are reported using the method outlined above. Can you assist?