Free Support Forum - aspose.com

Aspose.Words .docx - How to get images list

I have a .docx (see attached) that I placed one image in, set some alt text, then copied the image several times. None of these images show up when I use the following code:

NodeCollection shapes = document.GetChildNodes(NodeType.Shape, true);

This code does work for .doc. All I did was save the .doc as .docx and now the code does not work.

How do I get the images list with .docx?

Thanks,

Jeff Montgomery

Hi

Thanks for your request. Images in your document are DrawingML objects:

http://www.aspose.com/documentation/.net-components/aspose.words-for-.net/aspose.words.drawing.drawingml.html

That is why they are not listed in the Shapes collection. Since the DOC format does not support DrawingML, all DrawingML objects are converted to regular Shapes when you save your DOCX document as DOC.

Best regards,

Where are the image properties of DrawingML? What I'm trying to do is to loop through all the shapes and replace placeholder images with final images using the same size and other properties as the existing image.

The way I am doing this is to capture all the image properties for each existing shape, create a new shape and assign it the same properties. I then delete the old image:

newImage = new Shape(document, ShapeType.Image);
newImage.ImageData.SetImage(imageBytes);

newImage.Height = currentShape.Height;
newImage.Width = currentShape.Width;
newImage.AlternativeText = altText;
newImage.ZOrder = currentShape.ZOrder;
newImage.WrapType = currentShape.WrapType;
newImage.WrapSide = currentShape.WrapSide;
newImage.AnchorLocked = currentShape.AnchorLocked;
newImage.HorizontalAlignment = currentShape.HorizontalAlignment;
newImage.VerticalAlignment = currentShape.VerticalAlignment;
newImage.BehindText = currentShape.BehindText;

currentShape.ParentParagraph.InsertBefore(newImage, currentShape);
currentShape.Remove();

Or, is there a better way to replace the image of an existing shape and still have it fill the same size, position and other properties?

Thanks,

Jeff

To clarify what I said: I figured out that for .doc, I can just do the following to replace an existing image with a new one:

currentShape.ImageData.SetImage(imageBytes);

This switches the image. My question above refers to how to do this when the shape is DrawingML.

-Jeff

Hi there,

Thanks for your inquiry. Currently there are no public members to access the properties of a DrawingML node. We plan to add access to these properties soon. I have linked your request to the appropriate issue, and you will be informed as soon as it is resolved.

In the meantime, you can still achieve what you are looking for by first saving your document in DOC format in memory, reloading it, and using your code above. Please see the code below.

Document doc = new Document(dataDir + "Images.docx");

// Save document as DOC in memory.
MemoryStream stream = new MemoryStream();
doc.Save(stream, SaveFormat.Doc);

// Reload document as DOC to extract images.
Document doc2 = new Document(stream);
NodeCollection shapes = doc2.GetChildNodes(NodeType.Shape, true);

Thanks,

Thanks, that's an interesting workaround I wouldn't have thought of. Will this cause any loss of .docx-only formatting or settings?

Jeff

Hi Jeff,

Thanks for your request. No, you will not lose any content or formatting when converting the document from DOCX to DOC. However, you should note that some features supported by the DOCX format are not supported by the DOC format. As with DrawingML, all DrawingML objects must be converted to Shapes during conversion because the DOC format does not support DrawingML. The same applies to Themes: the DOC format does not support Themes, so formatting applied via Themes is converted to direct formatting applied to the text when you convert DOCX to DOC.

In any case, the output DOC file should look the same as the original DOCX. If you find any inconsistency, it should be considered a bug. Please report any such issues to us.

Best regards,

The loss of features is still a loss of data to some extent. I don't think users would like to have their theming disappear from their documents. This definitely presents a problem for our commercial application.

I have a request then. My reason for asking about this is we need to change the image inside the shape. Therefore, the functionality we'd most like to see is something similar to Aspose.Words ImageData.SetImage():

currentShape.ImageData.SetImage(imageBytes);

Then, we would not need to add a new shape and set its properties.

Thanks,

Jeff

Hi Jeff,

Thanks for your request. We already linked your request to the appropriate issue. We will let you know once Aspose.Words allows accessing data of DrawingML objects.

Best regards,

I have a Word document that I need to convert to a jpeg. The document contains a DrawingML image as the background and a TextBox shape in the foreground. When converting to the jpeg, the TextBox is sent behind the image. I converted the document to a .Doc as you stated above so that I may set the ZOrder index of the shapes; however, I get the same results. The code does recognize two shapes, Image and TextBox, and does set the ZOrder properly when rendering as a Word document. I have attached a sample document that uses a solid navy image as the background and a white textbox as the foreground. I am using Words 9.6.0.0, .NET framework 4.0. Below is the code I use to convert this document.

Aspose.Words.License wl = new Aspose.Words.License();
wl.SetLicense(HttpContext.Current.Server.MapPath(".") + "\\Resources\\Aspose.Custom.lic.xml");
Aspose.Words.Document doc = new Aspose.Words.Document("c://Test.docx");
MemoryStream stream = new MemoryStream();
doc.Save(stream, Aspose.Words.SaveFormat.Doc);
// Reload document as DOC to extract images.
Aspose.Words.Document doc2 = new Aspose.Words.Document(stream);
Aspose.Words.NodeCollection shapes = doc2.GetChildNodes(Aspose.Words.NodeType.Shape, true);
foreach (Aspose.Words.Drawing.Shape shape in shapes)
{
    if (shape.ShapeType == Aspose.Words.Drawing.ShapeType.TextBox)
    {
        shape.ZOrder = 2;
    }
    else
    {
        shape.ZOrder = 1;
    }
}

MemoryStream ms = new MemoryStream();
doc2.Save(ms, Aspose.Words.SaveFormat.Jpeg);
Response.BinaryWrite(ms.ToArray());
Response.ContentType = "image/jpeg";

///The word document is generated correctly here.
/*
Aspose.Words.Saving.OoxmlSaveOptions so = new Aspose.Words.Saving.OoxmlSaveOptions();
so.SaveFormat = Aspose.Words.SaveFormat.Docx;
doc2.Save(Response, "Test.docx", Aspose.Words.ContentDisposition.Attachment, so);
Response.End();
*/

timg - That sounds like this other bug that I submitted: http://www.aspose.com/community/forums/thread/282886.aspx

When loading a .docx, Aspose.Words does not import all shapes in the correct Z order. Certain shapes are imported in the right order, while others are sent to the back. The linked post has a workaround, but it's to save as .doc, in which case you lose the 2007 features. If it's indeed that bug, we will both have to wait for it to be fixed.

Sorry, that was for Excel, yours is for Word. But it sounds similar.

Hi Tim,

Thanks for your request. The problem occurs because Aspose.Words does not support “Through” text wrapping of shapes upon rendering. Your request has been linked to the appropriate issue. We will let you know once this type of wrapping is supported.

In the meantime, as a workaround, you can put your background shape into the header of the document or just use “Behind Text” text wrapping instead of “Through”.

Best regards.

Alexey,

Thanks for the reply. I have added the following line to my code and it now works properly:

foreach (Aspose.Words.Drawing.Shape shape in shapes)
{
    if (shape.ShapeType == Aspose.Words.Drawing.ShapeType.TextBox)
    {
        shape.ZOrder = 2;
    }
    else
    {
        shape.WrapType = Aspose.Words.Drawing.WrapType.None;
        shape.ZOrder = 1;
    }
}

Hi Tim,

It is great that you managed to resolve the problem. Please feel free to ask if you run into any issues; we will be glad to assist you.

Best regards,

Alexey,

I have a new requirement to convert this into a PDF document. Adding the code below makes the textbox disappear. While launching the document in Word, the textbox is there, so I'm assuming it's behind the graphic again.

MemoryStream msPdf = new MemoryStream();
doc.Save(msPdf, Aspose.Words.SaveFormat.Pdf);
Response.ContentType = "application/pdf";
Response.BinaryWrite((byte[])msPdf.ToArray());
 

Hi

Thanks for your request. Could you attach the document that shows the issue? I will check the problem on my side and provide you with more information.

Best regards,

Please note that I am using the same code with Andrey’s suggestion posted earlier in this thread on 2/8 and 2/9. The only difference is the additional code I added to convert the document to a pdf.

Hello.

Thank you for the additional information. Unfortunately, I was unable to reproduce your problem (Aspose.Words 9.7.0.0). I used the following code:

License license = new License();
license.SetLicense("Aspose.Words.lic");

Document doc = new Document("E:\\test2.docx");

// Save document as DOC in memory.
MemoryStream stream = new MemoryStream();
doc.Save(stream, SaveFormat.Doc);

// Reload document as DOC to extract images.
Document doc2 = new Document(stream);
NodeCollection shapes = doc2.GetChildNodes(NodeType.Shape, true);

foreach (Aspose.Words.Drawing.Shape shape in shapes)
{
    if (shape.ShapeType == Aspose.Words.Drawing.ShapeType.TextBox)
    {
        shape.ZOrder = 2;
    }
    else
    {
        shape.WrapType = Aspose.Words.Drawing.WrapType.None;
        shape.ZOrder = 1;
    }
}

MemoryStream msPdf = new MemoryStream();
doc2.Save(msPdf, SaveFormat.Pdf);
Response.ContentType = "application/pdf";
Response.BinaryWrite((byte[])msPdf.ToArray());
Response.End();

See the attached file test2.pdf.

Viktor

Thanks for the reply. What I realized is that I was converting the docx to doc so that I can set the ZOrder properly, but then I was saving that back to docx, which I would then convert to pdf. I thought the WrapType.None setting would remain when converting back to docx, allowing the ZOrder of the shapes to be set properly.

I wrote a new method with a MemoryStream return type that will take a docx, convert it to doc, then return the pdf stream converted from the doc file.
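
A minimal sketch of such a method is below. It only combines the calls already shown in this thread. The class and method names (DocxToPdfConverter, ConvertDocxToPdf) are hypothetical, and the ZOrder values are the ones from my sample document, so treat it as an illustration rather than a general solution. It assumes an Aspose.Words version from around this thread (9.x), where DrawingML objects become Shape nodes only after the DOC round trip.

```csharp
using System.IO;
using Aspose.Words;
using Aspose.Words.Drawing;

public static class DocxToPdfConverter   // hypothetical helper class, not part of Aspose.Words
{
    // Takes a DOCX stream, round-trips it through DOC, and returns a PDF stream.
    public static MemoryStream ConvertDocxToPdf(Stream docxStream)
    {
        Document docx = new Document(docxStream);

        // Save as DOC in memory so DrawingML objects are converted to regular Shapes.
        MemoryStream docStream = new MemoryStream();
        docx.Save(docStream, SaveFormat.Doc);
        docStream.Position = 0; // rewind defensively before reloading

        // Reload the DOC version and fix the shape order on it.
        Document doc = new Document(docStream);
        foreach (Shape shape in doc.GetChildNodes(NodeType.Shape, true))
        {
            if (shape.ShapeType == ShapeType.TextBox)
            {
                shape.ZOrder = 2;
            }
            else
            {
                shape.WrapType = WrapType.None;
                shape.ZOrder = 1;
            }
        }

        // Render to PDF directly from the DOC-based document; do not go back to DOCX first.
        MemoryStream pdfStream = new MemoryStream();
        doc.Save(pdfStream, SaveFormat.Pdf);
        pdfStream.Position = 0; // leave the stream ready for the caller to read
        return pdfStream;
    }
}
```

The important point is that the PDF is saved from the document that was reloaded from DOC, not from one saved back to DOCX in between.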

Thanks