Office 2007 Embedded OLE Objects

Hi,

when you

- use Office 2003 and the Office 2007 Compatibility Pack,
- create a Word or Powerpoint file and embed an Excel chart object into it
- save the file as a DOCX or PPTX file

then you receive a ZIP file with a DOCX or PPTX extension.

Within this ZIP file you can find a file called OLEOBJECT1.BIN, that contains the embedded Excel chart object as an OLE object.

I would like to change the data, that the Excel chart is based on. Therefore, I need to extract the Excel file from OLEOBJECT1.BIN, change the data and then rewrite the binary file.

Is this possible using Aspose.Cells or other Aspose products?

Do you know another way to achieve the goal?

Regards, Mario

Hi Mario,

Aspose are working to support Office2007 DOCX, PPTX and XLSX format. This feature will be supported in Q1 2007.

Currently you can try to use your own code to retrieve whole data from the OLEOBJECT1.BIN and pass it to Aspose.Cells to change and rewrite the binary file.

This sounds, as if it is possible to load the data from OLEOBJECT1.BIN into an Aspose.Cells object.

How can I do this?

And how can rewrite the binary file, without making it unusable?

Regards, Mario

You can load data of OLEOBJECT1.BIN into MemoryStream and then use Workbook.Open method to load it into Aspose.Cells objects.

After you change the data, just replace the BIN file with new data.

OLEOBJECT1.BIN is actually an xls file. You can change its name to book1.xls and then use MS Excel to directly open it.

Thank you. Very good.

I thought, these files are serialized OLE objects with an OLE header of an unknown size.

Is this documented somewhere? I searched the Open XML specification but could not find anything about these bin files.

Regards, Mario

Hi Mario,

I also don't find docs about it. But I find these files are just Excel files.

Following your advice, I have now managed to write some data to the oleobject1.bin file.

I used the code shown below.

Originally, the file contained a worksheet with a diagram object and a table worksheet with some data that the diagram is based on.

After the code completes, the diagram object no longer exists in the file! It seems like the code has deleted the diagram object but I cannot see, where / why this happens.

Do you have any clue, why the code shown below removes the diagram object from the first worksheet?

----------------

using (Stream excelStream = binFile.OpenRead ())

{

// if the Excel file could be opened

if (excelStream != null)

{

// read the file content into a MemoryStream

Byte[] bytes = new byte[excelStream.Length];

excelStream.Read(bytes, 0, (int)excelStream.Length);

sourceMemoryStream = new MemoryStream(bytes);

}

}

// write the data from the placeholder DataTable to

// the cell range, provided together with the placeholder

if (sourceMemoryStream != null)

{

Aspose.Cells.Workbook workbook = new Aspose.Cells.Workbook();

workbook.LoadData(sourceMemoryStream);

workbook.Worksheets[chartData.WorksheetName].Cells.ImportDataTable(

chartData.Data, false, 1, 1, false);

targetMemoryStream = workbook.SaveToStream();

}

// open the Excel file in write mode

using (Stream excelStream = binFile.OpenWrite (true))

{

if (targetMemoryStream != null)

{

// write the modified workbook to the Excel file

targetMemoryStream.WriteTo(excelStream);

}

}

LoadData method only loads data in Excel file and discards other information.

For your purpose, you should use Workbook.Open method:

workbook.Open(sourceMemoryStream, FileFormatType.Excel2003);

Thank you very much.

When I now open the Excel file, the diagram page still shows the old diagram, though the data has changed. Only after clicking on the diagram, it is being updated and then reflects the changed data.

How can I update the diagram snapshot programmatically to reflect the current data in the workbook?

Could you please post following files here?

1. Your DOCX or PPTX file.

2. The output DOCX or PPTX file after processed by Aspose.Cells.

Thank you.

I found no upload function on the Reply form.

Therefore, I have sent the two files to nanjing@aspose.com.

You are free to refer to these files in this thread or attach the files to your response, so that others can view it.

Thanks for your support,
Mario

Dear Laurence,

I have now learned that I have to replace an EMF file in the ppt/media folder. This EMF image file contains the snapshot of the Excel chart.

Is it somehow possible to retrieve an Excel diagram as an image using Aspose.Cells, so that I can replace the image stream in the EMF file?

Regards, Mario

Dear Mario,

You are right. The EMF image is the snapshot of the Excel chart.

Currently Aspose.Cells doesn't support to get a thumbnail image of a chart. However, we are working on this feature. Hopefully it will be available in Q1 2007.

OK. Thanks for your help.

My customer expects to get a Powerpoint file containing several charts every morning. We cannot force the users to double click on each chart image to update it. All charts must be up-to-date, when the file is being opened.

Therefore, I fear, that we cannot support Office 2003, even though Aspose.Cells offers all that is needed to edit Excel 2003 files. Thus, your product does not help at all, only because I am not able to create an image file from a chart.

If you see any other way, I appreciate your proposals.

In Reference to the discussion regarding embedded objects, you say "OLEOBJECT1.BIN is actually an xls file". If I unzip any Office 2007 file and find an OLEObject xx.bin in an embedded folder, is it always an xls file? Should I always use Workbood.Open method to load it. What if the embedded object is a .doc or .ppt file? or another type?

Hi,

Well, "OLEOBJECT1.BIN" is not always an xls file, it can be of any format. And, do you need to open the bin file(s) or else? For your info, we can only extract ole objects in the template file but cannot open them. Moreover, we are not very sure about your need, could you elaborate and explain it more what are your requirements actually, we will check if we can support it for your need.

BTW do you need to check the file types. For extracting ole objects based on their file types, you may check the following documentation topic for your reference if it suits your need:

http://www.aspose.com/documentation/file-format-components/aspose.cells-for-.net-and-java/managing-ole-objects.html

Thank you.

Hi,<?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" />

Thank you for considering Aspose.

Also, to add to Amjad’s reply, you can use some custom coding to check the signature of the binary file to check whether it’s an Excel file or not (if that is your requirement). Please see the following code in this regard,

Sample Code:

private void Form1_Load(object sender, EventArgs e)

{

FileStream stream = new FileStream(@"e:\excels\oleObject1.bin", FileMode.Open);

bool valid = false;

int header = 0;

FileFormatType ft = GetFileType("oleObject1.bin", stream, out valid, out header);

if(valid&&ft.Equals(FileFormatType.Excel2003))

{

//open 2003 workbook;

}

//same way you can add the checks for other file formats too

}

//Function to check the file format by the signature

internal static FileFormatType GetFileType(string fileName, Stream stream, out bool isValid, out int header)

{

isValid = false;

FileFormatType type = FileFormatType.Default;

BinaryReader reader = new BinaryReader(stream);

ulong signature = reader.ReadUInt64();

header = (int)(signature & 0xFFFF);

stream.Seek(-8, SeekOrigin.Current);

if (signature == 0xe11ab1a1e011cfd0)

{

isValid = true;

type = FileFormatType.Excel2003;

}

else if ((signature & 0xFFFFFFFF) == 0x04034b50)

{

isValid = true;

type = FileFormatType.Excel2007Xlsx;

if (fileName != null)

{

string extension = Path.GetExtension(fileName);

if (extension != null)

{

switch (extension.ToLower())

{

case ".xlsx":

type = FileFormatType.Excel2007Xlsx;

break;

case ".xlsm":

type = FileFormatType.Excel2007Xlsm;

break;

case ".xltx":

type = FileFormatType.Excel2007Xltx;

break;

case ".xltm":

type = FileFormatType.Excel2007Xltm;

break;

}

}

}

}

else

{

if ((signature & 0xFF) == 0xEF)//Uicode Text

{

stream.Seek(3, SeekOrigin.Current);

}

char[] chs = reader.ReadChars(5);

if ((new string(chs)).ToLower().Equals("<?xml"))

{

isValid = true;

type = FileFormatType.SpreadsheetML;

}

stream.Seek(0, SeekOrigin.Begin);

}

return type;

}

Thank You & Best Regards,

Hi thanks for all the helpful support. In answer to your questions, I have embedded objects in several types of Office documents. I have figured out from all of your help, how to extract different types of documents embedded in an excel document using Aspose.Cells.OleObject. The only part that I have not figured out completely is extracting from pptx documents. It looks like OleObjects are not yet supported in Aspose.Slides.Pptx, only in Aspose.Slides. However, I am interested in the oleobjectxxx.bin files because, if I unzip a pptx, I will find an oleobject.bin file in the ppt/embedded folder. I think if I can extract the file type from this bin file, I can persist this file to the correct corresponding office type of file. For instance, if the oleobject.bin is a word document, if I save it as oleobjectxxx.doc, it looks like it opens as a word document.

So I think if I can find out how to decipher the ulong signature to determine the filetype, I should be able to use corresponding Aspose object to save it from the stream to a file.

I hope that explains the question you have. And if so, can you tell me how I extract out that filetype? Your code only indicates how to test for Excel signatures.

Thanks,

Karen Schmidt

Hi,

Well, I 'm afraid we cannot support your requirement as we only evaluate MS Excel files.

Sorry for any inconvenience.

Thank you.