Thank you for the additional information. Here is a simple code example that shows how to extract OLE objects from a document:
// Requires: import com.aspose.words.*;
// Open document.
Document doc = new Document("C:\\Temp\\in.doc");
// Get all shapes.
NodeCollection shapes = doc.getChildNodes(NodeType.SHAPE, true);
// Loop through all shapes.
for (int i = 0; i < shapes.getCount(); i++)
{
    Shape shape = (Shape)shapes.get(i);
    // Skip the current shape if it has no OLE object.
    if (shape.getOleFormat() == null)
        continue;
    // Determine the extension of the object.
    // Let's use the bin extension by default.
    String extension = "bin";
    // Assumption: the object type is identified by its ProgID.
    String progId = shape.getOleFormat().getProgId();
    if ("Word.Document.8".equals(progId))
        extension = "doc";
    if ("Excel.Sheet.8".equals(progId))
        extension = "xls";
    // Save OLE object.
    shape.getOleFormat().save(String.format("C:\\Temp\\out_%d.%s", i, extension));
}
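If you need to handle more object types, the extension choice in the code above can be moved into a small lookup. The following is only a sketch using plain Java: the class name OleExtensions and the method extensionForProgId are hypothetical and not part of Aspose.Words, and the ProgID strings are the usual Office ones, listed here as an assumption about what your documents contain. It also assumes you pass in the ProgID of the embedded object (for example, the value returned by the shape's OLE format).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of Aspose.Words): maps an OLE ProgID to a file extension.
class OleExtensions {
    private static final Map<String, String> EXTENSIONS = new HashMap<>();

    static {
        // Assumed ProgID values; extend the table for the objects you expect.
        EXTENSIONS.put("Word.Document.8", "doc");
        EXTENSIONS.put("Word.Document.12", "docx");
        EXTENSIONS.put("Excel.Sheet.8", "xls");
        EXTENSIONS.put("Excel.Sheet.12", "xlsx");
    }

    // Returns the mapped extension, or "bin" for unknown or missing ProgIDs.
    static String extensionForProgId(String progId) {
        if (progId == null) {
            return "bin";
        }
        return EXTENSIONS.getOrDefault(progId, "bin");
    }

    public static void main(String[] args) {
        System.out.println(extensionForProgId("Excel.Sheet.8")); // prints "xls"
        System.out.println(extensionForProgId("Package"));       // prints "bin"
    }
}
```

Unknown objects still get the bin extension, so the loop can keep saving everything it finds and you can inspect the unrecognized files later.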
Regarding Excel objects, I managed to reproduce the problem (output Excel files cannot be opened in Excel). Your request has been linked to the appropriate issue. You will be notified as soon as it is resolved.
2. I think most of the functionality will be added before February 2010.
3. Please see the following link to learn how to extract text from documents:
Also, you can create your own converter using Aspose.Words. The technique is described here:
Hope this helps.