I’m having trouble finding sample code to do the following. The code I
did find in this forum did not work: some methods and properties
referred to by the code were not there, so I wonder if it was written
for a different version of the library.
Save embedded attachments (OLE objects) as separate files
Create a TIFF (or PNG) image of a document
Save (semi-formatted) text of a document
My project is in Java, so I’d rather use the Java version, but from
looking at the documentation, the Java libraries are lagging behind, so
I may have to use the .NET versions. Please advise…
Q1. What is the timeframe before the Java libraries catch up?
Q2. Are the .NET versions more active? Will they always be more up to date and have the latest bug fixes?
Could you please attach the document from which you need to extract OLE objects? I will check it on my side and provide you with more information.
Aspose.Words for Java does not support converting documents to images yet. This feature will be available at the very end of this year or at the beginning of next year. I will notify you as soon as this feature is available.
It is not quite clear what you mean. Could you please be more specific? What document format are you interested in?
Currently we are working on synchronizing the .NET and Java versions of Aspose.Words. We will finish this work sometime at the beginning of next year.
Best regards.
You can use any Word document in which a file or an Excel workbook has been inserted and shows as an icon. (I have attached a document as well.)
Would most of the functionality be added to the Java version by, say, March 2010?
Just for “extract the text of the document” purposes, so we can index the text. It doesn’t have to be heavily formatted, but the paragraphs should be in order, with headers/footers maybe added whenever a new header or footer is set in the document (not on each page, since there is no concept of a page for the extracted text), and with the text of WordArt/“textbox” items extracted.
I’m looking for sample code to do these things, so I can evaluate the library. If you have code, or have references in the manual where I can find it, please send it to me.
Thank you for the additional information. Here is simple code that shows how to extract OLE objects from the document:
import com.aspose.words.*;

// Open document.
Document doc = new Document("C:\\Temp\\in.doc");
// Get all shapes in the document, including nested ones.
NodeCollection shapes = doc.getChildNodes(NodeType.SHAPE, true);
// Loop through all shapes.
for (int i = 0; i < shapes.getCount(); i++)
{
    Shape shape = (Shape)shapes.get(i);
    // Skip shapes that do not contain an OLE object.
    if (shape.getOleFormat() == null)
        continue;
    // Determine the extension of the object.
    // Let's use the bin extension by default.
    String extension = "bin";
    // Compare from the literal side so a missing ProgId does not throw.
    String progId = shape.getOleFormat().getProgId();
    if ("Word.Document.8".equals(progId))
        extension = "doc";
    if ("Excel.Sheet.8".equals(progId))
        extension = "xls";
    // Save OLE object.
    shape.getOleFormat().save(String.format("C:\\Temp\\out_%d.%s", i, extension));
}
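If the attachments can be of more than two types, the chain of ProgId checks in the sample above may be easier to maintain as a lookup table. The following is only a sketch in plain Java and does not use the Aspose API: the class name OleExtensions and the method extensionFor are made up for illustration, and only the first two ProgId values come from the sample above. The other entries are assumptions about common Office ProgIds and should be checked against your own documents.

```java
import java.util.Map;

// Hypothetical helper (not part of Aspose.Words): maps an OLE ProgId to a file extension.
final class OleExtensions {
    // The first two entries come from the sample above; the rest are assumed
    // ProgIds for newer Office formats and may need adjusting.
    private static final Map<String, String> BY_PROG_ID = Map.of(
        "Word.Document.8", "doc",
        "Excel.Sheet.8", "xls",
        "Word.Document.12", "docx",
        "Excel.Sheet.12", "xlsx",
        "PowerPoint.Show.8", "ppt",
        "PowerPoint.Show.12", "pptx");

    private OleExtensions() {}

    // Returns the mapped extension, or "bin" when the ProgId is null or unknown.
    static String extensionFor(String progId) {
        if (progId == null)
            return "bin";
        return BY_PROG_ID.getOrDefault(progId, "bin");
    }
}
```

Inside the loop, the two if statements could then be replaced with String extension = OleExtensions.extensionFor(shape.getOleFormat().getProgId()); so unknown object types still fall back to the bin extension.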