Free Support Forum - aspose.com

Extracting embedded OLE Objects from a PowerPoint Presentation and an Excel Workbook

Can anyone help me with sample code for extracting embedded OLE objects from a PowerPoint presentation and from an Excel workbook?

I am referring to the blog mentioned below, and I cannot see .GetSlideByPosition.

In my code I can see GetSlideById, and I am using Aspose.Slides.NET 16.12.1.

Please help.

@vjlourdu84,

Thanks for contacting support.

Please follow the instructions at the following link for information on extracting OLE objects from an Excel workbook.

Furthermore, Aspose.Slides for .NET currently supports updating OLE objects automatically using the MS PowerPoint add-in, as well as creating an Excel chart and embedding it in a presentation as an OLE object. However, regarding the extraction of embedded OLE objects, we will get back to you soon.

@vjlourdu84,

Please note that the above-mentioned blog is quite old and the approach described in it is obsolete. There have been many changes in the API since 2014; please visit the following link for the latest information about the Presentation.GetSlideById method. Furthermore, please note that we do not have a public API that allows extracting the embedded stream, but this can easily be achieved using a third-party open source library such as OpenMcdf.

First of all, you need to reference this library. It is available as a NuGet package, and you can add it to your project using the Package Manager Console:

Install-Package OpenMcdf

Alternatively, use the “Manage NuGet Packages” tool, or just download the library and reference it directly. Then the following sample code can be used to work with the embedded document using Aspose.Cells (if the embedded item is an Excel object).

In case you encounter any issue, please share the input file, so that we can test the scenario in our environment.

[C#]

// Namespaces required by this method:
using System;
using System.IO;
using System.Linq;
using Aspose.Cells;
using Aspose.Slides;
using OpenMcdf;

public static void ExtractOle()
{
    Presentation pres = new Presentation(@"Powerpoint_Excel_Dummy.pptx");
    foreach (var slide in pres.Slides)
    {
        Console.WriteLine("Slide #{0}: {1} shapes", slide.SlideNumber, slide.Shapes.Count());
        foreach (var shape in slide.Shapes.OrderBy((s) => s.Name))
        {
            Console.WriteLine("- {0} - {1}", shape.ToString(), shape.Name);
            if (shape is OleObjectFrame)
            {
                // "Objekt 3" is the shape name in the sample presentation; adjust it for your file.
                if ("Objekt 3".Equals(shape.Name))
                {
                    OleObjectFrame ole = shape as OleObjectFrame;
                    if ((ole != null) && ("Worksheet".Equals(ole.ObjectName)))
                    {
                        // The OLE data is a compound file; the workbook is stored in its "Package" stream.
                        using (MemoryStream memoryStream = new MemoryStream(ole.ObjectData))
                        {
                            CompoundFile compoundFile = new CompoundFile(memoryStream);
                            CFStream stream = compoundFile.RootStorage.GetStream("Package");
                            byte[] packageData = stream.GetData();

                            // Load the extracted package with Aspose.Cells.
                            using (MemoryStream packageDataStream = new MemoryStream(packageData))
                            {
                                Workbook wb = new Workbook(packageDataStream);
                                Worksheet ws = wb.Worksheets[0];
                                using (MemoryStream msOut = new MemoryStream())
                                {
                                    // msOut receives the embedded workbook as XLSX; write it to disk or process it as needed.
                                    wb.Save(msOut, Aspose.Cells.SaveFormat.Xlsx);
                                }
                            }
                        }
                    }
                }
            }
        }
    }
    pres.Save(@"Test.pptx", Aspose.Slides.Export.SaveFormat.Pptx);
}
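
The sample above assumes that ole.ObjectData is an OLE compound file containing a "Package" stream. As a hedged sketch, the leading bytes can be checked before handing the data to OpenMcdf. The class and method names below (OleContainerSniffer, Detect) are hypothetical helpers, not part of Aspose or OpenMcdf; only the two file signatures are standard.

```csharp
using System;

public static class OleContainerSniffer
{
    public enum ContainerKind { Unknown, CompoundFile, ZipPackage }

    // Signature of an OLE compound file (the container OpenMcdf reads).
    private static readonly byte[] CompoundFileSignature =
        { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    // Local file header signature of a ZIP archive (e.g. a bare XLSX/DOCX package).
    private static readonly byte[] ZipSignature = { 0x50, 0x4B, 0x03, 0x04 };

    // Classifies the raw OLE bytes by their leading signature.
    public static ContainerKind Detect(byte[] data)
    {
        if (StartsWith(data, CompoundFileSignature)) return ContainerKind.CompoundFile;
        if (StartsWith(data, ZipSignature)) return ContainerKind.ZipPackage;
        return ContainerKind.Unknown;
    }

    private static bool StartsWith(byte[] data, byte[] signature)
    {
        if (data == null || data.Length < signature.Length) return false;
        for (int i = 0; i < signature.Length; i++)
        {
            if (data[i] != signature[i]) return false;
        }
        return true;
    }
}
```

With such a check, the OpenMcdf path would only run when Detect(ole.ObjectData) returns CompoundFile, while a ZipPackage result suggests the bytes can be loaded directly.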

codewarior,
Thanks for the sample code and the blogs. I am using the code below to extract the OLE objects. I have a couple of questions.

  1. Is it possible to get the exact file name of the embedded OLE object (similar to shape.OleFormat.IconCaption in Aspose.Words)? See https://www.webnots.com/how-to-change-embedded-file-name-in-word-excel-and-office-documents/
     Attachment: High level scenario on Data Quality.zip (954.9 KB)
  2. Do I really need the switch statement to find the file extensions? (In Aspose.Words we have something like shape.OleFormat.SuggestedExtension.)
  3. How do I extract linked OLE objects? Will the same code work?

Please help.

    private void Extract(string SourceFile)
    {
    // Open the template file.
    Workbook workbook = new Workbook(SourceFile + "High level scenario on Data Quality.xlsx");

         for (int z = 0; z < workbook.Worksheets.Count; z++)
         {
             // Get the OleObject collection of the current worksheet.
             Aspose.Cells.Drawing.OleObjectCollection oles = workbook.Worksheets[z].OleObjects;
    
             // Loop through all the oleobjects and extract each object.
             // In the worksheet.
             for (int i = 0; i < oles.Count; i++)
             {
                 Aspose.Cells.Drawing.OleObject ole = oles[i];
    
                 // Specify the output filename.
                 string fileName = SourceFile + "ole_" + i + ".";
                 fileName = SourceFile + ole.ObjectSourceFullName + i + ".";
    
                 // Specify each file format based on the oleobject format type.
                 switch (ole.FileFormatType)
                 {
                     case FileFormatType.Doc:
                         fileName += "doc";
                         break;
                     case FileFormatType.Docx:
                         fileName += "docx";
                         break;
                     case FileFormatType.Xlsm:
                         fileName += "xlsm";
                         break;
                     case FileFormatType.CSV:
                         fileName += "csv";
                         break;
                     case FileFormatType.XPS:
                         fileName += "xps";
                         break;
                     case FileFormatType.Xlsx:
                         fileName += "xlsx";
                         break;
                     case FileFormatType.Ppt:
                         fileName += "ppt";
                         break;
                     case FileFormatType.Pptx:
                         fileName += "pptx";
                         break;
                     case FileFormatType.VSD:
                         fileName += "vsd";
                         break;
                     case FileFormatType.VSDX:
                         fileName += "vsdx";
                         break;
                     case FileFormatType.Pdf:
                         fileName += "Pdf";
                         break;
                     case FileFormatType.Unknown:
                         fileName += "Jpg";
                         break;
                     default:
                         //........
                         break;
                 }
                 // Save the OLE object as a new Excel file if the object type is XLSX.
                 if (ole.FileFormatType == FileFormatType.Xlsx)
                 {
                     // Wrap the data directly so the stream starts at position 0.
                     using (MemoryStream ms = new MemoryStream(ole.ObjectData))
                     {
                         Workbook oleBook = new Workbook(ms);
                         oleBook.Settings.IsHidden = false;
                         oleBook.Save(SourceFile + "Excel_File" + i + ".out.xlsx");
                     }
                 }
                 // For all other format types, write the raw object data to a file.
                 else
                 {
                     File.WriteAllBytes(fileName, ole.ObjectData);
                 }
             }
         }
     }

@vjlourdu84,
Thank you for the inquiry. We are working on your query and will get back to you soon.

Best Regards,
Imran Rafique

@vjlourdu84,

Thanks for your patience.

We have further evaluated your sample code and template Excel file, and as per our findings, you are following the correct approach for extracting OLE objects from the Excel workbook.

  1. We also got a different file name (i.e., Microsoft_Word_Document1.docx) for the OLE object. This is because the underlying OLE object may not be a linked object, or the file name stored in the file's source XML may not match the original file name that you want.
  2. We only have the OleObject.ObjectSourceFullName attribute to get or set the source file name, so you need to use it; I am afraid there is no other alternative at the moment.

Furthermore, you can use the same code snippet to extract all types of OLE objects from the workbook. If you still face any issue or have any further query, please feel free to contact us.
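
Regarding the switch statement for file extensions, one possible alternative is a lookup table. The sketch below is only an illustration: OleExtensions and ForFormat are hypothetical names, the table is keyed by the enum member name (as returned by ole.FileFormatType.ToString()) so that it does not depend on Aspose.Cells at compile time, and the "bin" fallback for unmapped formats is an assumption rather than library behaviour.

```csharp
using System;
using System.Collections.Generic;

public static class OleExtensions
{
    // Format names taken from the switch statement in the question;
    // the comparison ignores case so "CSV" and "Csv" both match.
    private static readonly Dictionary<string, string> ByFormatName =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "Doc", "doc" }, { "Docx", "docx" },
            { "Xlsx", "xlsx" }, { "Xlsm", "xlsm" }, { "CSV", "csv" },
            { "Ppt", "ppt" }, { "Pptx", "pptx" },
            { "VSD", "vsd" }, { "VSDX", "vsdx" },
            { "Pdf", "pdf" }, { "XPS", "xps" },
        };

    // Returns the extension for a format name, or the fallback when the
    // name is null or not in the table.
    public static string ForFormat(string formatName, string fallback = "bin")
    {
        string ext;
        return formatName != null && ByFormatName.TryGetValue(formatName, out ext)
            ? ext
            : fallback;
    }
}
```

In the extraction loop this could be used as: string fileName = SourceFile + "ole_" + i + "." + OleExtensions.ForFormat(ole.FileFormatType.ToString());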

I already tried using OleObject.ObjectSourceFullName, but it still shows Microsoft_Word_Document1.docx.

You mentioned that the underlying OLE object may not be a linked object, or that the stored file name may differ. Can you please let me know the exact process for linking the document in the workbook, so that I can get the icon name (the exact file name) from the workbook?

Since I am displaying it in a web UI, the application user expects the same file name that they see in the Excel workbook.

Please help.

@vjlourdu84,

Thanks for sharing the details.

We are working on creating the required code snippet and will get back to you soon.

@vjlourdu84,

Thanks for your patience.

We have looked further into the above scenario. As per our findings, when you insert an OLE object in MS Excel, MS Excel automatically changes the file name to “Microsoft_Word_Document1” for MS Word documents. We have also checked your template file and found that the embedded object name there is also “Microsoft_Word_Document1”; Aspose.Cells simply returns the name stored in the source file (you may unzip the file, or open your XLSX file in a zip tool, and check the embedded object, which is stored as “Microsoft_Word_Document1”). Therefore we cannot do anything to get the original name; I am afraid this is the behaviour of MS Excel.
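
The zip check described above can also be done in code. The following is a minimal sketch using only System.IO.Compression; it relies on embedded payloads of an XLSX package being stored under xl/embeddings/, and the class and method names (EmbeddedPartLister, ListEmbeddedParts) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;

public static class EmbeddedPartLister
{
    // Returns the full names of all parts stored under xl/embeddings/
    // in the XLSX package read from the given stream.
    public static List<string> ListEmbeddedParts(Stream xlsxStream)
    {
        using (var archive = new ZipArchive(xlsxStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            return archive.Entries
                .Where(e => e.Name.Length > 0) // skip directory entries
                .Where(e => e.FullName.StartsWith("xl/embeddings/", StringComparison.OrdinalIgnoreCase))
                .Select(e => e.FullName)
                .ToList();
        }
    }

    // Convenience overload that opens the XLSX file from disk.
    public static List<string> ListEmbeddedParts(string xlsxPath)
    {
        using (var file = File.OpenRead(xlsxPath))
        {
            return ListEmbeddedParts(file);
        }
    }
}
```

Calling EmbeddedPartLister.ListEmbeddedParts on the template workbook and printing each entry shows the names Excel actually stored, which is what ObjectSourceFullName reflects for embedded objects.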

If you still want Aspose.Cells to return the original source file name, you should add your OLE object to the workbook as a linked OLE object. Note, however, that for a linked object you can only get the original file name; you cannot get the content of the OLE object.
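
For display in a web UI, the source name of a linked object may need to be reduced to a bare file name. The helper below is a hedged sketch: OleDisplayName and FromSourceName are hypothetical names, and it assumes the value read from OleObject.ObjectSourceFullName may be a full Windows or URL-style path. It splits on both separators manually, so it behaves the same on any host operating system.

```csharp
using System;

public static class OleDisplayName
{
    // Returns the last path segment of sourceFullName, or the fallback
    // when the source name is missing or ends with a separator.
    public static string FromSourceName(string sourceFullName, string fallback)
    {
        if (string.IsNullOrWhiteSpace(sourceFullName)) return fallback;

        string trimmed = sourceFullName.Trim();
        int cut = trimmed.LastIndexOfAny(new[] { '\\', '/' });
        string name = cut >= 0 ? trimmed.Substring(cut + 1) : trimmed;

        return name.Length > 0 ? name : fallback;
    }
}
```

A caller might pass a generated name such as "ole_" + i as the fallback, so that every object still gets a label when no source name is stored.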

Thank you for your understanding. In case of any further query, please feel free to contact us.