Embedded Object Extraction from Documents (such as MS Word, MS Excel, MS PowerPoint, PDF, Email (.msg))

We need to develop a batch job to extract embedded objects from documents (such as MS Word, MS Excel, MS PowerPoint, PDF, Email (.msg)). In some scenarios an embedded object contains another embedded object. Please confirm whether we can achieve this requirement using the Aspose API.

Thank you in advance.

Hi Mahesh,

Thanks for contacting support.

To perform the extraction, you need to use the individual API for each file format.

Aspose.Email can create and manipulate email message files, and it can also extract regular and inline attachments. For objects embedded in the message body, the API does not have full extraction capability on its own, so it is used in combination with OpenMCDF. For more information, please visit Extraction of Embedded Objects.
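As a rough illustration of the regular-attachment case, a minimal sketch could look like the following. The file name "Sample.msg" and the "out" folder are placeholders chosen for this example, and objects embedded in the message body (the OpenMCDF case mentioned above) are not covered here.

```csharp
using System.IO;
using Aspose.Email;

class EmailAttachmentExtractor
{
    static void Main()
    {
        // "Sample.msg" and "out" are placeholder names for this sketch.
        MailMessage message = MailMessage.Load("Sample.msg");
        Directory.CreateDirectory("out");

        int i = 0;
        foreach (Attachment attachment in message.Attachments)
        {
            // Keep only the file-name part and prefix a counter so that
            // attachments sharing a name do not overwrite each other.
            string name = Path.GetFileName(attachment.Name);
            attachment.Save(Path.Combine("out", $"{i++}_{name}"));
        }
    }
}
```

If a saved attachment is itself a document (for example a .docx or .xlsx file), it can then be passed to the matching API to look for nested objects.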

Aspose.Slides can extract OLE objects embedded in presentation slides. If the extracted OLE object is an Excel object, Aspose.Cells can load its byte array (wrap the byte array in a stream to load it). If that OLE object contains a nested OLE object, this is not a problem either, because Aspose.Cells can extract it in turn. Each extracted OLE object can then be loaded with the API that matches its type. For further details, please visit Changing an OLE Object data.
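The nested case could be sketched roughly as below. This assumes a recent Aspose.Slides version in which the embedded bytes are exposed through the frame's EmbeddedData property (older versions expose them differently), and it only looks at shapes placed directly on a slide, not inside group shapes. The file names are placeholders, and nested objects are written with a generic ".bin" extension for simplicity.

```csharp
using System;
using System.IO;
using Aspose.Slides;

class NestedOleExtractor
{
    static void Main()
    {
        // "Sample.pptx" is a placeholder input file for this sketch.
        using (Presentation pres = new Presentation("Sample.pptx"))
        {
            int i = 0;
            foreach (ISlide slide in pres.Slides)
            {
                foreach (IShape shape in slide.Shapes)
                {
                    IOleObjectFrame frame = shape as IOleObjectFrame;
                    if (frame == null || frame.EmbeddedData == null) continue;

                    byte[] data = frame.EmbeddedData.EmbeddedFileData;
                    if (data == null) continue;

                    // Normalize the extension so the check works with or without a leading dot.
                    string ext = (frame.EmbeddedData.EmbeddedFileExtension ?? "").TrimStart('.');
                    File.WriteAllBytes($"ole_{i}.{ext}", data);

                    bool isExcel = ext.Equals("xlsx", StringComparison.OrdinalIgnoreCase)
                                || ext.Equals("xls", StringComparison.OrdinalIgnoreCase);
                    if (isExcel) ExtractFromWorkbook(data, i);
                    i++;
                }
            }
        }
    }

    // Loads the extracted bytes with Aspose.Cells and saves any OLE objects nested inside.
    // Aspose.Cells types are fully qualified to avoid name clashes with Aspose.Slides.
    static void ExtractFromWorkbook(byte[] data, int parentIndex)
    {
        using (MemoryStream ms = new MemoryStream(data))
        {
            Aspose.Cells.Workbook workbook = new Aspose.Cells.Workbook(ms);
            int j = 0;
            foreach (Aspose.Cells.Worksheet sheet in workbook.Worksheets)
            {
                foreach (Aspose.Cells.Drawing.OleObject nested in sheet.OleObjects)
                {
                    if (nested.ObjectData == null) continue;
                    // nested.FileFormatType could be used to pick a real extension
                    // and to decide which API should open the nested object next.
                    File.WriteAllBytes($"ole_{parentIndex}_nested_{j++}.bin", nested.ObjectData);
                }
            }
        }
    }
}
```

For a batch job, the same pattern can be repeated per type: extract the bytes, check the type, and hand them to the corresponding API until no further embedded objects are found.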

Aspose.Cells lets you create and manipulate MS Excel files. For more information, please visit Extracting OLE Objects in the Workbook.

Aspose.Words lets you create and manipulate MS Word files. To accomplish your requirement, please try the following code snippet.
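[C#]

using System;
using Aspose.Words;
using Aspose.Words.Drawing;

Document doc = new Document("Sample.docx");
int i = 0;
// Get all shapes in the document, including those nested inside other nodes
NodeCollection shapes = doc.GetChildNodes(NodeType.Shape, true);
// Loop through all shapes
foreach (Shape shape in shapes)
{
    // Only shapes that hold an OLE object have a non-null OleFormat
    if (shape.OleFormat != null)
    {
        // SuggestedExtension is expected to include the leading dot (e.g. ".xlsx")
        shape.OleFormat.Save(String.Format("out_{0}{1}", i++, shape.OleFormat.SuggestedExtension));
    }
}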

Aspose.Pdf for .NET lets you create and manipulate PDF files. It can also add and extract attachments embedded in a PDF file. For more information, please visit Get Individual Attachment.
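A minimal sketch of extracting every attachment from a PDF might look like this. "Sample.pdf" and the "out" folder are placeholder names for the example, and the counter prefix is just one way to avoid overwriting attachments that share a name.

```csharp
using System.IO;
using Aspose.Pdf;

class PdfAttachmentExtractor
{
    static void Main()
    {
        // "Sample.pdf" and "out" are placeholder names for this sketch.
        using (Document pdf = new Document("Sample.pdf"))
        {
            Directory.CreateDirectory("out");

            int i = 0;
            foreach (FileSpecification file in pdf.EmbeddedFiles)
            {
                // Skip entries that carry no content stream.
                if (file.Contents == null) continue;

                string name = Path.GetFileName(file.Name);
                using (FileStream output = File.Create(Path.Combine("out", $"{i++}_{name}")))
                {
                    // Copy the attachment bytes to disk.
                    file.Contents.CopyTo(output);
                }
            }
        }
    }
}
```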

Should you have any further queries, please feel free to contact us.

Regards,
Nayyer Shahbaz