Unable to extract image from an .ODT document

Hi,


We are using Aspose.Words to extract images from an .ODT document.
Although each image’s icon is extracted, the actual image content is not.

Can you please guide?

Thanks!
Hi Sonu,

Thanks for your inquiry. Could you please attach your input document here for testing? I will investigate the issue on my side and provide you with more information.

Best regards,

Hello,

Attaching the sample ODT documents.

Regards.

Hi Sonu,


Thanks for your inquiry. You can use the following code to extract images from Embedded Packages:
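import com.aspose.words.*;

Document doc = new Document("C:\\Temp\\Sample_embedded_image.odt");
// Counter declared outside the loop so each extracted file gets a unique name.
int i = 0;
for (Shape shape : (Iterable<Shape>) doc.getChildNodes(NodeType.SHAPE, true)) {
    // Only shapes that hold an OLE object have a non-null OleFormat.
    if (shape.getOleFormat() != null && "Package".equals(shape.getOleFormat().getProgId())) {
        // Save the content of the embedded package to disk.
        shape.getOleFormat().save("C:\\temp\\img" + i + ".jpg");
        i++;
    }
}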
I hope this helps.

Best regards,

Thanks for the reply.

The statement shape.getOleFormat() always returns null, so no images can be extracted this way.

We need an alternative way to extract them.

Regards

Hi Sonu,


Thanks for your inquiry. I am working on your query and will get back to you as soon as possible.

Best regards,

Hi Sonu,


Thanks for your patience.

On further investigation, we found that the drawing object in your ‘Test_ODT_Insert_Object.odt’ document is not preserved during open/save by the current version of Aspose.Words. We have logged this problem in our bug tracking system as WORDSNET-7924, and the fix will be available in the next version of Aspose.Words, i.e. 13.3.0. Your request has been linked to this issue and you will be notified as soon as the new version of Aspose.Words is published.

Best regards,
Hi Sonu,

I have received a response from our development team: extraction of a picture from an OLE object is not supported in the current version of Aspose.Words either. I have logged a new feature request in our issue tracking system. The feature ID is WORDSNET-7969. Your thread has been linked to it and you will be notified as soon as it is supported. Sorry for the inconvenience.
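
Until that feature is available, one possible workaround that does not depend on Aspose.Words is to treat the ODT file as the ZIP package it is and copy the image entries out directly. The following is only a minimal sketch, not an Aspose.Words API: the class name OdtImageExtractor and its methods are hypothetical, and it assumes that the editor stores images under "Pictures/" and preview images of embedded objects under "ObjectReplacements/", which is the usual layout for files written by OpenOffice but is not guaranteed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper for illustration only; not part of Aspose.Words.
final class OdtImageExtractor {

    // Assumption: image data lives under these folders inside the ODT package.
    static boolean isImageEntry(String name) {
        return name.startsWith("Pictures/") || name.startsWith("ObjectReplacements/");
    }

    // Copies every matching entry of the ODT package into outputDir
    // and returns the paths of the files that were written.
    static List<Path> extract(Path odtFile, Path outputDir) throws IOException {
        Files.createDirectories(outputDir);
        List<Path> written = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(Files.newInputStream(odtFile))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory() || !isImageEntry(entry.getName())) {
                    continue;
                }
                // Flatten the entry name so it cannot escape outputDir.
                String safeName = entry.getName().replace('/', '_').replace('\\', '_');
                Path target = outputDir.resolve(safeName);
                // Files.copy reads up to the end of the current ZIP entry
                // and leaves the ZIP stream itself open.
                Files.copy(zip, target, StandardCopyOption.REPLACE_EXISTING);
                written.add(target);
            }
        }
        return written;
    }
}
```

It could be called as OdtImageExtractor.extract(Paths.get("C:\\Temp\\Test_ODT_Insert_Object.odt"), Paths.get("C:\\Temp\\images")). Files taken from "ObjectReplacements/" are only previews of the embedded objects and may be in a metafile format rather than JPEG or PNG, so this does not replace proper extraction of OLE object content.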

Best regards,

When do you expect this feature to be included in the product? Is there a release date?


Thanks

Hi Sonu,


Thanks for your inquiry. Currently, issue WORDSNET-7969 is in the queue, pending analysis. I am afraid I cannot provide a reliable estimate at the moment. Once the issue has been analyzed, we will be able to give you an estimate. I apologize for any inconvenience.

Best regards,

The issues you have found earlier (filed as WORDSNET-7924) have been fixed in this .NET update and this Java update.


This message was posted using Notification2Forum from Downloads module by aspose.notifier.

Thanks for the update!


The fix works for Sample_embedded_image.odt, which is an ODT file created in Microsoft Word with the image embedded as an object.

However, the issue still persists for Test_ODT_Insert_Object.odt, which is an ODT file created in OpenOffice with the image embedded as an OLE object.

Kindly provide us with a solution for this.

Thanks.

P.S.: The two ODT files are attached for reference.

Hi Sonu,


Thanks for your inquiry.

I was unable to reproduce this issue on my side using Aspose.Words for Java 13.3.0; the ‘opendocument.DrawDocument.1’ embedded object in your attached ‘Test_ODT_Insert_Object.odt’ is imported correctly into the Aspose.Words DOM. I have attached the output ODT document, produced on my side with the following code snippet, for your reference. Could you please double-check that you are using Aspose.Words 13.3.0 on your side?
// Load the ODT document and save it again to check that the embedded object survives the round trip.
Document doc = new Document("c:\\temp\\Test_ODT_Insert_Object.odt");
doc.save("C:\\Temp\\out 13.3.0.odt");
Best regards,

Hello,


The issue is with image extraction when the ODT is created in OpenOffice and has an image embedded as an OLE object.

The sample document that exhibits this issue is Test_ODT_Insert_Object.odt. Below is the code snippet we use, in which shape.getOleFormat() always returns null:

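Document doc = new Document("C:\\Temp\\Sample_embedded_image.odt");
// Counter declared outside the loop so each extracted file gets a unique name.
int i = 0;
for (Shape shape : (Iterable<Shape>) doc.getChildNodes(NodeType.SHAPE, true)) {
    // Only shapes that hold an OLE object have a non-null OleFormat.
    if (shape.getOleFormat() != null && "Package".equals(shape.getOleFormat().getProgId())) {
        // Save the content of the embedded package to disk.
        shape.getOleFormat().save("C:\\temp\\img" + i + ".jpg");
        i++;
    }
}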

P.S.: Aspose.Words for Java 13.3.0 has this issue.


Hi Sonu,

Thanks for the additional information.

Extraction of a picture from an OLE object (WORDSNET-7969) is not yet supported in the current version of Aspose.Words. You will be notified as soon as it is supported. Sorry for the inconvenience.

Best regards,

Hi Sonu,


Thanks for being patient. Regarding WORDSNET-7969, please clarify whether you want to get the content of the OLE object or the picture of the OLE object, because the content of an OLE object is not always an image.

Best regards,

We need the content of the OLE object.


The issue was encountered when an image is embedded as an OLE object in an ODT file created in OpenOffice.

Hi Sonu,


Thanks for the details. We will inform you as soon as this issue is resolved. We apologize for any inconvenience.

Best regards,

The issues you have found earlier (filed as WORDSNET-7969) have been fixed in this .NET update and this Java update.