Free Support Forum - aspose.com

OLEObject - extract text from embedded objects in Word

Hi,
My client needs to extract all the text displayed in a Word document, regardless of whether it comes from an OLE object. Is the best method to save each object separately using OLEFormat.SuggestedExtension, then extract each object's text? I was hoping that I could get the displayed text of an xls file through the OLEFormat object.

On a related note, I'm not able to access the underlying Excel file for an embedded chart in my Word document. How do I do this?

Attached is my sample file.

Thanks,
Kiima Forshey
Aspose.Words 9.5.0.0
VB.NET

Hi

Thanks for your inquiry. The content of an embedded OLE object is represented as an image in the document, so there is no way to get the displayed text directly. The only way to get the text of an OLE object is to extract the object from the document and then extract the text from it.

Best regards,

OK, that makes sense.

Any answer for my second question? How do I access the underlying spreadsheet for the embedded chart?

Thanks,
-Kiima

Hi

Thanks for your inquiry. You can use the OleFormat.Save method to extract the underlying OLE object from the document:

http://www.aspose.com/documentation/.net-components/aspose.words-for-.net/aspose.words.drawing.oleformat.save_overloads.html

Please let me know if you need more assistance; I will be glad to help.

Best regards.

Thanks for the reply, but I was referring to the missing OLE object.

Attached is a Word document with an embedded chart. I should be able to access/save the underlying Excel file using the NodeCollection, but for the attached file NodeCollection.Count = 0.
Here is my code. Please note that I’ve been able to access/save other OLE objects successfully.

' Collect every Shape node in the document, including nested ones.
Dim colShapes As Aspose.Words.NodeCollection = awdDoc.GetChildNodes(Aspose.Words.NodeType.Shape, True, False)
For Each objShape As Aspose.Words.Drawing.Shape In colShapes
    If objShape.HasImage Then
        Select Case objShape.ShapeType
            Case Aspose.Words.Drawing.ShapeType.Image
                ' (image handling omitted)
            Case Aspose.Words.Drawing.ShapeType.OleObject
                If objShape.OleFormat IsNot Nothing Then
                    ' intImage and strOleExt are declared outside this snippet.
                    Dim strExtractedPath As String = "C:\BlackIceTemp" & intImage.ToString() & strOleExt
                    objShape.OleFormat.Save(strExtractedPath)
                End If
        End Select
    ElseIf objShape.IsWordArt Then
        ' (WordArt handling omitted)
    End If
Next

Thanks,
-Kiima

Hi

Thanks for your request. What you are asking about are not actually OLE objects. This is the data of OOXML diagrams and charts: MS Word stores it as Excel documents inside the DOCX document. We will consider adding the ability to access the data of OOXML diagrams and charts. I have linked your request to the appropriate issue, and you will be notified as soon as it is resolved.
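
Until such a feature is available, one possible workaround is to read the DOCX package directly. A DOCX file is a ZIP archive, and Word normally keeps the workbook behind a chart under word/embeddings/. The following is only a minimal sketch and does not use Aspose.Words at all. It assumes .NET Framework 4.5 or later with references to System.IO.Compression and System.IO.Compression.FileSystem; the module name, function name and file paths are hypothetical and chosen only for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.IO.Compression

Module EmbeddedWorkbookExtractor   ' hypothetical name, not part of any library

    ' Copies every part stored under word/embeddings/ in a DOCX package
    ' to outputDir and returns the paths of the files written.
    Function ExtractEmbeddings(docxPath As String, outputDir As String) As List(Of String)
        Dim saved As New List(Of String)()
        Directory.CreateDirectory(outputDir)

        Using archive As ZipArchive = ZipFile.OpenRead(docxPath)
            For Each entry As ZipArchiveEntry In archive.Entries
                ' Skip directory entries (empty Name) and parts outside word/embeddings/.
                If entry.Name <> "" AndAlso
                   entry.FullName.StartsWith("word/embeddings/", StringComparison.OrdinalIgnoreCase) Then
                    Dim target As String = Path.Combine(outputDir, entry.Name)
                    entry.ExtractToFile(target, True)   ' True = overwrite an existing file
                    saved.Add(target)
                End If
            Next
        End Using

        Return saved
    End Function

    Sub Main()
        ' Hypothetical paths; replace with your own document and output folder.
        For Each savedPath As String In ExtractEmbeddings("C:\Temp\sample.docx", "C:\Temp\embeddings")
            Console.WriteLine(savedPath)
        Next
    End Sub

End Module
```

This only works for DOCX (not binary DOC) files, and it extracts every embedded part it finds, so the output may include files other than chart workbooks.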

Best regards,

ok, thanks.

The issues you have found earlier (filed as WORDSNET-3958) have been fixed in this .NET update and this Java update.

