Free Support Forum - aspose.com

How to extract embedded objects (PDFs, docs, ...) inside Excel

Hi Aspose,


We have a requirement to extract all the embedded objects in an Excel sheet. We need to get the names of the embedded objects and also extract the files.

For example, if a PDF file (pdfattach.pdf) is embedded in the Excel file, we need to save the PDF file and get the name “pdfattach”.

Thanks in Advance,
Prabu J

Hi,


Thanks for your query.

Please see the document(s) for your reference:
http://www.aspose.com/docs/display/cellsnet/Managing+OLE+Objects#ManagingOLEObjects-ExtractingOLEObjectsintheWorkbook (.NET)
http://www.aspose.com/docs/display/cellsjava/Managing+OLE+Objects#ManagingOLEObjects-ExtractingOLEObjectsintheWorkbook (JAVA)

Thank you.

Hi,


Thanks for your reply.

We are able to save the objects, but our problem is getting the “name” of the files.

The example saves the files under different names. We need to save each file under the same name that is shown in the XLS.

In the example given at http://www.aspose.com/docs/display/cellsjava/Managing+OLE+Objects#ManagingOLEObjects-ExtractingOLEObjectsintheWorkbook (JAVA)

the file names seen in the XLS are v6.pdf, datatest.xls…
But we are saving them as OLE1, OLE2…

We would like to know whether there is a way to get the name/title of the OLE objects embedded in the XLS.

Thanks and Regards,
Prabu

Hi,


Please try using the OleObject.getObjectSourceFullName() method for your requirement.

Hope this helps a bit.

Thank you.

Hi,


We have tried all the options given below, but we still didn't get the name of the embedded file. We are getting only null as the value.

System.out.println(ob.getAlternativeText());
System.out.println(ob.getHtmlText());
System.out.println(ob.getImageSourceFullName());
System.out.println(ob.getInputRange());
System.out.println(ob.getLinkedCell());
System.out.println(ob.getMacroName());
System.out.println(ob.getName());
System.out.println(ob.getObjectSourceFullName());
System.out.println(ob.getProgID());
System.out.println(ob.getSourceFullName());
System.out.println(ob.getText());
System.out.println(ob.getTitle());
System.out.println(ob.getFileFormatType());
System.out.println(ob.getFileType());

Is there any other way to get the file name?

Thanks and Regards,
Prabu

Hi,


For your information, if the underlying OLE object is not a linked object, the stored file name would not be the same as the original file name. In any case, could you provide your template Excel file containing the OLE objects? We will check it soon.

Thank you.
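
Because the name getters can return null for embedded (non-linked) objects, an extraction routine may want a fallback name instead of failing. The following is a minimal sketch only, assuming the caller has already read a candidate name from the OLE object and has decided on a file extension by some other means. The class OleFileNames and the method chooseFileName are hypothetical names for illustration and are not part of Aspose.Cells.

```java
// Hypothetical helper for illustration; not part of the Aspose.Cells API.
final class OleFileNames {
    private OleFileNames() {}

    /**
     * Picks the file name to save an extracted object under.
     * sourceName: whatever name was read from the OLE object (may be null or blank).
     * index:      1-based position of the object, used only for the fallback name.
     * extension:  extension chosen by the caller (an assumption of this sketch).
     */
    static String chooseFileName(String sourceName, int index, String extension) {
        // Use the stored name when one is actually present.
        if (sourceName != null && !sourceName.trim().isEmpty()) {
            return sourceName.trim();
        }
        // Otherwise fall back to a generated name such as OLE1.pdf.
        return "OLE" + index + "." + extension;
    }
}
```

With this sketch, a call such as OleFileNames.chooseFileName(null, 1, "pdf") falls back to OLE1.pdf, while a non-empty stored name is used unchanged.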

Hi,


I’m also encountering the same issue when getting the original file names of embedded objects (Word, Excel and PowerPoint) from Excel. Please find attached an Excel file with embedded objects of various file types, and a screenshot of the extraction output. The method getObjectSourceFullName() does not return the original file name for Word, Excel, PowerPoint, PDF and BMP image objects. Please advise.

Thanks.

Hi Lee Chiaw Chin,


Thanks for the template file and screenshot.

Yes, you are right; I noticed the same when testing your scenario. I got different file names for the embedded objects. But as I said earlier, since your underlying OLE objects are not linked objects, the stored file names may not be the same as the original file names. In any case, I will check further with the product team whether this is an issue or expected behavior. We will get back to you soon.

Sample code:

// Classes used by this snippet (Aspose.Cells for Java)
import com.aspose.cells.MsoDrawingType;
import com.aspose.cells.OleObject;
import com.aspose.cells.OleObjectCollection;
import com.aspose.cells.Workbook;

// Instantiate a Workbook object (the constructor can throw an Exception)
Workbook workbook = new Workbook("XLSX-EMBEDDED.xlsx");

// Get the OleObject collection of the first worksheet.
OleObjectCollection oles = workbook.getWorksheets().get(0).getOleObjects();

// Loop through all the OLE objects in the worksheet and print each source name.
for (int i = 0; i < oles.getCount(); i++) {
    if (oles.get(i).getMsoDrawingType() == MsoDrawingType.OLE_OBJECT) {
        OleObject ole = (OleObject) oles.get(i);
        System.out.println(ole.getSourceFullName());
    }
}

Thank you.

Hi Amjad Sahi


Thanks for your reply. I would like to know whether the product team has checked if this is an issue or expected behavior.

Thanks

Hi,


Here are the results we got via the Aspose.Cells APIs:
Microsoft_Word_97_-_2003_Document1.doc
Microsoft_Word_Document1.docx
Microsoft_Excel_97-2003_Worksheet2.xls
Microsoft_Excel_Macro-Enabled_Worksheet2.xlsm
Microsoft_Excel_Worksheet3.xlsx
Microsoft_PowerPoint_97-2003_Presentation3.ppt
Microsoft_PowerPoint_Presentation4.pptx
C:\Users\lcchin\Desktop\New folder (4)\1\ERS-CS-20161016-MSG.msg
C:\Users\lcchin\Desktop\New folder (4)\1\ERS-CS-20161016-TXT.txt
oleObject3.bin
C:\Users\lcchin\Desktop\New folder (4)\1\ERS-CS-20161016-HTML.html

C:\Users\lcchin\Desktop\New folder (4)\2\ERS-CS-20161016-GIF.gif
C:\Users\lcchin\Desktop\New folder (4)\2\ERS-CS-20161016-JPEG.jpeg
C:\Users\lcchin\Desktop\New folder (4)\2\ERS-CS-20161016-JPG.jpg
C:\Users\lcchin\Desktop\New folder (4)\2\ERS-CS-20161016-PNG.png
C:\Users\lcchin\Desktop\New folder (4)\2\ERS-CS-20161016-TIF.tif
C:\Users\lcchin\Desktop\New folder (4)\2\ERS-CS-20161016-TIFF.tiff

Could you give us your expected results? We will check it soon.

Thank you.
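
The results listed in this thread mix bare file names with full Windows paths, while the original request was for the name without its extension (for example “pdfattach” from pdfattach.pdf). As a rough sketch, assuming the string has already been read from the OLE object, the name could be derived with plain string handling as below. The class OleSourceNames and its methods are hypothetical helpers for illustration, not Aspose.Cells APIs. Backslashes are handled by hand so that the result does not depend on the operating system the code runs on.

```java
// Hypothetical helper for illustration; not part of the Aspose.Cells API.
final class OleSourceNames {
    private OleSourceNames() {}

    /** Returns the last path segment, treating both '\' and '/' as separators. */
    static String fileName(String sourceFullName) {
        if (sourceFullName == null) {
            return null;
        }
        int cut = Math.max(sourceFullName.lastIndexOf('\\'),
                           sourceFullName.lastIndexOf('/'));
        // When no separator is found, cut is -1 and the whole string is returned.
        return sourceFullName.substring(cut + 1);
    }

    /** Returns the file name without its final extension, e.g. "pdfattach.pdf" -> "pdfattach". */
    static String baseName(String sourceFullName) {
        String name = fileName(sourceFullName);
        if (name == null) {
            return null;
        }
        int dot = name.lastIndexOf('.');
        // A dot at index 0 is kept, so a name like ".txt" is not reduced to an empty string.
        return dot > 0 ? name.substring(0, dot) : name;
    }
}
```

This only normalizes whatever string is stored in the workbook. It cannot recover an original file name that was never stored, as with the generic Microsoft_Word_Document1.docx style names above.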