Embedded Objects in Aspose.Words for Java

Hi all,
I need to extract any embedded object from a Word document using Java. I saw in previous posts that this is possible using C#. However, our solution is based on Java.
Is this functionality available there as well? If so, could you please share some code that does this? Are there any constraints on this functionality? E.g. it only extracts OLE-2 embedded objects, or it does not work for the new DOCX format, etc.
Many thanks for the information
Patrick Vanbrabant

Hi
Thanks for your request. In a DOCX document, there can be two kinds of embedded objects:

  1. OLE objects. Aspose.Words supports such objects, so you can extract them from the document.
  2. “Embedded Packages”. This is a new way of embedding objects into an MS Word document, which was introduced with DOCX. Currently, Aspose.Words does not support extracting such objects.

Your request has been linked to the appropriate issue. You will be notified as soon as it is supported.
Inserting new OLE objects into Word documents and updating existing OLE objects is not supported at the moment. Inserting an OLE object usually requires the host application and probably cannot be done by Aspose.Words.
Here is a simple code example that shows how to extract OLE objects from a document:

import com.aspose.words.Document;
import com.aspose.words.NodeCollection;
import com.aspose.words.NodeType;
import com.aspose.words.OleFormat;
import com.aspose.words.Shape;

// Open document.
Document doc = new Document("C:\\Temp\\in.doc");
// Get all shapes, including those nested inside other nodes.
NodeCollection shapes = doc.getChildNodes(NodeType.SHAPE, true);
// Loop through all shapes.
for (int i = 0; i < shapes.getCount(); i++)
{
    Shape shape = (Shape) shapes.get(i);
    // Skip shapes that do not contain an OLE object.
    OleFormat oleFormat = shape.getOleFormat();
    if (oleFormat == null)
        continue;
    // Determine the extension of the object.
    // Use the bin extension by default.
    String extension = "bin";
    String progId = oleFormat.getProgId();
    // Compare from the literal side so a missing ProgId cannot cause a NullPointerException.
    if ("Word.Document.8".equals(progId))
        extension = "doc";
    else if ("Excel.Sheet.8".equals(progId))
        extension = "xls";
    // Save OLE object.
    oleFormat.save(String.format("C:\\Temp\\out_%d.%s", i, extension));
}
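
If more object types need to be recognised, the chain of ProgId checks above could be moved into a small lookup table. The following is only a sketch and not part of Aspose.Words: the class name ProgIdExtensions and the method extensionFor are hypothetical, and the PowerPoint.Show.8 and AcroExch.Document entries are assumptions about ProgIDs commonly registered by PowerPoint and Acrobat that should be checked against your own documents.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not an Aspose.Words class): maps a ProgId string
// to a file extension, falling back to "bin" for anything unknown.
public class ProgIdExtensions {
    private static final Map<String, String> EXTENSIONS = new HashMap<>();
    static {
        // These two values appear in the sample code above.
        EXTENSIONS.put("Word.Document.8", "doc");
        EXTENSIONS.put("Excel.Sheet.8", "xls");
        // Assumption: ProgID commonly used by PowerPoint 97-2003 objects.
        EXTENSIONS.put("PowerPoint.Show.8", "ppt");
    }

    public static String extensionFor(String progId) {
        if (progId == null)
            return "bin";
        String extension = EXTENSIONS.get(progId);
        if (extension != null)
            return extension;
        // Assumption: Acrobat ProgIDs start with "AcroExch.Document" and may
        // carry a version suffix, so match on the prefix.
        if (progId.startsWith("AcroExch.Document"))
            return "pdf";
        return "bin";
    }
}
```

Inside the loop, the extension would then be obtained with ProgIdExtensions.extensionFor(shape.getOleFormat().getProgId()).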

Best regards,

Thanks a lot for your quick reply.
Is there a way to find out the possible values of getOleFormat().getProgId()?
In our Word documents we often have PDF, PowerPoint, XLS, Word documents and ZIP files embedded.
Is the name of the embedded document maintained?
Regards
Patrick

Hi
Thanks for your inquiry. There is no such list. The ProgID is stored in the document binary as a string, and we simply extract it from there. Here are some examples of possible ProgId values:
MS_ClipArt_Gallery
Equation.3
WPGraphic21
MIDFile
MSGraph.Chart.8
PBrush
MSPhotoEd.3
Excel.Sheet.8
WordPad.Document.1
etc…
In addition, there is no way to get the original name of an embedded document. Exposing the original filename is not possible because it is not stored in the document.
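
Since there is no fixed list of ProgId values and no filename to go by, one possible fallback is to look at the first bytes of the data after it has been saved. The sketch below is plain Java and does not use Aspose.Words; the class SignatureSniffer and its method guessKind are hypothetical names. It assumes the saved data begins with the raw file content, which may not hold for every embedded object, depending on how the object was embedded.

```java
// Hypothetical helper (not an Aspose.Words class): guesses the kind of
// content from well-known leading byte signatures.
public class SignatureSniffer {
    // OLE2 compound file header (used by binary DOC, XLS and PPT files).
    private static final byte[] OLE2 = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // PDF files start with "%PDF".
    private static final byte[] PDF = {'%', 'P', 'D', 'F'};
    // ZIP local file header; DOCX, XLSX and PPTX are ZIP containers too.
    private static final byte[] ZIP = {'P', 'K', 3, 4};

    public static String guessKind(byte[] data) {
        if (startsWith(data, PDF))
            return "pdf";
        if (startsWith(data, ZIP))
            return "zip";
        if (startsWith(data, OLE2))
            return "ole2";
        return "unknown";
    }

    // Returns true if data begins with all bytes of the signature.
    private static boolean startsWith(byte[] data, byte[] signature) {
        if (data == null || data.length < signature.length)
            return false;
        for (int i = 0; i < signature.length; i++) {
            if (data[i] != signature[i])
                return false;
        }
        return true;
    }
}
```

The bytes could be read from the saved file, for example with java.nio.file.Files.readAllBytes, and the result used to rename a file that was written with the default bin extension.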
Best regards,

The issues you have found earlier (filed as 16730) have been fixed in this update.

This message was posted using Notification2Forum from Downloads module by aspose.notifier.