Extracting MS Word file from OLE Object (C# .NET)

@rausch

Can you please elaborate on your query in more detail? Also, please share a comparison screenshot with us so that we can investigate further and help you out.

I used your code; unfortunately, too much debug code and testing messed up some of my tests. Finally I managed to update the PowerPoint with a valid Word object! I have already written the code to generate my own preview image, which means I can finally implement the feature to load, modify and save embedded Word documents inside PowerPoint.
Took a while :) but nevertheless thanks for your support.

Side note: this is not part of Aspose, but since you recommended OpenMcdf: I just created a new template PowerPoint file and tested with it. Now I get

	$exception	{"Invalid OLE structured storage file"}	OpenMcdf.CFFileFormatException

I checked the OpenMcdf repo on github.com and used the latest NuGet version 2.2.1.3, but newly created PowerPoint files in which I embed Word files now produce this error.

It would be far easier and more robust if you just made it possible to use:

ole.ObjectData = word.WordStream.ToArray();

to set it directly. Note: as I have said several times, I do this with Aspose.Cells and it works fine there.

Here is my latest PowerPoint test file, which fails to work.

powerpoint.pptx.zip (78.6 KB)

@rausch ,

I have read your last post and would like to point out that Aspose.Cells and Aspose.Slides are two different APIs internally and cannot be compared. We appreciate that you have been able to extract the required information based on the information provided. We have already shared the mechanism that you can adopt for extraction using Aspose.Slides.

Hello,
they are probably two different APIs, but handling OLE objects in Office is more or less the same in PowerPoint and Excel, isn’t it? That is why I compared the functionality: Aspose.Cells allows direct usage without OpenMcdf, and I would be very happy if I didn’t face the problem that OpenMcdf now throws exceptions for the newest Office documents… it looks like Microsoft changed something inside…

@rausch,

I humbly disagree here. These are two completely different APIs internally and are managed by separate teams. I agree with your point of view that they are both part of the Office suite. I am not aware of any change made to MS Office. However, if you encounter any issue, please feel free to share it with us.

There is nothing to disagree with, I just asked about it :).
I understand that there are two teams; I am just wondering how the Cells team seems to manage this without any third-party component.
However, before I spend hours on this, can you check whether you can handle the PowerPoint file from my last post? It won’t load in my test system using the sample code you posted.

@rausch,

I have read your comments above. Can you please share the sample code or project that you used to reproduce the issue with the new presentation from your previous post? This way we will be on the same page and I can log the issue based on a code sample that reproduces it.

Hello, I used exactly the same code you kindly provided. It works with the “old” PowerPoint I shared in the first place. With a newly created PowerPoint (latest Office 365 version) I get the error from OpenMcdf mentioned before.

@rausch,

We have worked with the sample presentation you shared. You cannot open this embedded OLE data via OpenMcdf, because the data is not a Microsoft Compound File; it is a plain DOCX. This can be checked using the ole.EmbeddedFileExtension property. OleObjectFrame.ObjectData can contain different types of data (compound streams, zip files, Word or Excel documents, etc.). In this case, you can use the data directly from OleObjectFrame.ObjectData: if EmbeddedFileExtension is “.docx”, there is no need for any further processing with CompoundFile.
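
As a complement to the extension check, the payload type can also be inferred from the first bytes of the data. The following is only a minimal sketch, not part of the attached sample project: OlePayloadKind and OlePayloadSniffer are hypothetical names introduced here for illustration. It relies on two well-known signatures: compound files start with D0 CF 11 E0 A1 B1 1A E1, and ZIP-based packages such as DOCX start with "PK" followed by 03 04.

```csharp
using System;

// Hypothetical helper types for illustration only; they are not part of
// Aspose.Slides or OpenMcdf.
public enum OlePayloadKind { CompoundFile, ZipPackage, Unknown }

public static class OlePayloadSniffer
{
    // Header signature of a Microsoft Compound File (OLE structured storage).
    private static readonly byte[] CompoundFileSignature =
        { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    // Local file header signature of a ZIP archive ("PK" 03 04).
    // DOCX, XLSX and PPTX files are ZIP packages.
    private static readonly byte[] ZipSignature = { 0x50, 0x4B, 0x03, 0x04 };

    // Classifies raw embedded data by its leading bytes.
    public static OlePayloadKind Detect(byte[] data)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));
        if (StartsWith(data, CompoundFileSignature)) return OlePayloadKind.CompoundFile;
        if (StartsWith(data, ZipSignature)) return OlePayloadKind.ZipPackage;
        return OlePayloadKind.Unknown;
    }

    // Returns true when data begins with every byte of prefix.
    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

Passing the bytes from ole.ObjectData to Detect would then indicate whether the data should go through CompoundFile (CompoundFile result) or can be treated as the document itself (ZipPackage result). A ZIP signature alone does not say whether the package is a Word, Excel or PowerPoint file, so the EmbeddedFileExtension check remains the more specific test.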

Please try the following sample project.

TestOLEExtract-modified.zip (100.0 KB)

Hello, thank you, I will check your code. I did not even know that there is a difference.
In both files I did the same thing: I embedded a Word file using PowerPoint to create the template. I do not understand when the result is a compound file and when it is a plain Word document.

@rausch,

I have read your comments, and I am not sure how you added the file in PowerPoint. However, as I said in my earlier post, OLE object data can come in different types. You can use the sample project we shared to extract the data after checking the ole.EmbeddedFileExtension property.

Hello,
I am pretty sure that I did it the same way in both files, the one with the plain Word document and the one with the compound file. Nevertheless, your hint about the file extension helped to solve the issue, so now I can open, modify and save Word documents back to PowerPoint regardless of the format.
Thanks for your help and patience!
Bye Yves

@rausch,

Thank you for your understanding too. It’s good that things are finally resolved.