Find embedded Word documents in a PowerPoint presentation

Hi, I am using Aspose.Slides to find the embedded Word documents in the attached PowerPoint presentation. I can’t find them using Shape and OleObjectFrame. Can you advise me on where these Word documents are located and how to save them as separate Word documents? Thank you.

Hi James,


I have looked at your requirements. Please try the following sample code, which extracts the data from each OLE frame in the presentation and saves it to a separate file. You can modify the code as required, for example to handle other file formats.

using System;
using System.IO;
using Aspose.Slides;

public static void ReadPPTOle(String file, string path)
{
    // Load the PPT into a Presentation object
    Presentation pres = new Presentation(Path.Combine(path, file));

    // Walk every shape on every slide
    for (int i = 0; i < pres.Slides.Count; i++)
    {
        Slide sld = pres.Slides[i];
        for (int j = 0; j < sld.Shapes.Count; j++)
        {
            // Only OLE object frames carry embedded data; skip everything else
            OleObjectFrame oof = sld.Shapes[j] as OleObjectFrame;
            if (oof == null)
                continue;

            // Print the object name and ProgID (C# uses {0}/{1}, not %s)
            Console.WriteLine(String.Format("{0}\t\t\t\t{1}", oof.ObjectName, oof.ObjectProgId));

            // Choose a file extension from the object name and ProgID
            String FileType;
            if (oof.ObjectName.Equals("Worksheet") && oof.ObjectProgId.Equals("Excel.Sheet.12"))
                FileType = ".xlsx";
            else if (oof.ObjectName.Equals("Worksheet") && oof.ObjectProgId.Equals("Excel.Sheet.8"))
                FileType = ".xls";
            else if (oof.ObjectName.Equals("Document") && oof.ObjectProgId.Equals("Word.Document.12"))
                FileType = ".docx";
            else if (oof.ObjectName.Equals("Document") && oof.ObjectProgId.Equals("Word.Document.8"))
                FileType = ".doc";
            else if (oof.ObjectName.Equals("Presentation") && oof.ObjectProgId.Equals("PowerPoint.Show.12"))
                FileType = ".pptx"; // PowerPoint.Show.* is a presentation, not a slide show
            else if (oof.ObjectName.Equals("Presentation") && oof.ObjectProgId.Equals("PowerPoint.Show.8"))
                FileType = ".ppt";
            else if (oof.ObjectName.Equals("Acrobat Document") && oof.ObjectProgId.Equals("AcroExch.Document.7"))
                FileType = ".pdf";
            else
                FileType = ".txt"; // fallback for unrecognised objects

            // Write the raw embedded data to its own file
            string outPath = Path.Combine(path,
                "Extracted_OLE_Slide_" + i.ToString() + " OleIndex_" + j.ToString() + FileType);
            using (FileStream fstr = new FileStream(outPath, FileMode.Create, FileAccess.Write))
            {
                byte[] buf = oof.ObjectData;
                fstr.Write(buf, 0, buf.Length);
            }

            Console.WriteLine("File created...");
        }
    }
}
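
As a side note, the chain of extension checks in the sample could also be written as a lookup table. The following is only a sketch under a few assumptions: the class and method names (`OleExtensionMap`, `Guess`) are hypothetical and not part of Aspose.Slides, the lookup keys on the ProgID alone instead of the object name plus ProgID, and `PowerPoint.Show.12` / `PowerPoint.Show.8` are mapped to `.pptx` / `.ppt`, which are the usual extensions for those ProgIDs. Unknown ProgIDs fall back to `.txt`, as in the sample.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of Aspose.Slides.
public static class OleExtensionMap
{
    // ProgID -> file extension. Keys are compared case-insensitively.
    private static readonly Dictionary<string, string> Extensions =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "Excel.Sheet.12",      ".xlsx" },
            { "Excel.Sheet.8",       ".xls"  },
            { "Word.Document.12",    ".docx" },
            { "Word.Document.8",     ".doc"  },
            { "PowerPoint.Show.12",  ".pptx" },
            { "PowerPoint.Show.8",   ".ppt"  },
            { "AcroExch.Document.7", ".pdf"  }
        };

    // Returns the extension for a ProgID, or ".txt" when the ProgID
    // is null or not in the table (same fallback as the sample).
    public static string Guess(string progId)
    {
        string ext;
        if (progId != null && Extensions.TryGetValue(progId, out ext))
            return ext;
        return ".txt";
    }
}
```

With this in place, the sample would only need something like `String FileType = OleExtensionMap.Guess(oof.ObjectProgId);`, and supporting another embedded type means adding one row to the table.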
Many Thanks.

I have tried that on every slide, but it didn’t find the Word documents. There are 6 Word documents embedded in total. I verified this with another text extraction tool and a forensic tool. Is there a hidden area in a PPT, other than the slides, that can hold OLE objects?

Hi James,


We are sorry for the inconvenience. Can you please share the slide numbers and images of the shapes that are OLE frames containing Word documents? Unfortunately, I have not been able to find any Word document OLE frames in my scan. Please share these details so that I can investigate and help you further.

Many Thanks,

Hi Mudassir,


I have attached all of the embedded Word documents to this post. Unfortunately, I do not know which slides they are on. I went through all of the slides but couldn’t find them. However, they are in there. If you open the PPT in Notepad, you can see the content of those Word documents. Thank you for helping.


Hi James,


I have re-scanned the presentation on my end and have not been able to find the desired OLE frames. I also could not see any OLE frame shapes on the slides when the presentation is opened in PowerPoint. Can you please also share the version of the PPT presentation with us, as Aspose.Slides does not support PPT files older than PowerPoint 2003?

Many Thanks,

I think the version is 2003-2007, but I am not exactly sure. I was able to open it using Aspose.Slides. It seems that these embedded Word documents are hidden somewhere. None of the slides have them, and I looked in the master slides too. Do you think there is anywhere else they might be hidden? I can see the content of these Word documents when I open the PPT in Notepad. I also used several text extraction tools, and they were able to get the text of the PPT along with the text of all the Word documents.

Hi James,


I have created an issue with ID SLIDESNET-35202 in our issue tracking system to further investigate and resolve the problem of extracting OLE data using Aspose.Slides. This thread has been linked to the issue so that you will be notified automatically once the issue is resolved.

We are sorry for the inconvenience,

Hi James,

Our product team has investigated the issue on their end. There are a number of MSGraph embeddings, but no MS Word embeddings. Could you please share the slide numbers on which the MS Word embeddings can be found, together with the corresponding .doc files?

Many Thanks,

Hi James,

As mentioned earlier, our product team has investigated the issue on their end. There are a number of MSGraph embeddings, but no MS Word embeddings. Could you please share the slide numbers on which the MS Word embeddings can be found, together with the corresponding .doc files? Otherwise, we may close the issue on our end.

Many Thanks,