Trouble with Aspose DOM and with Shape Nodes

a79129 · August 23, 2013, 11:47am

Hi,
I’m having trouble trying to exclude some node types from a document. Just for testing, I tried listing every node type in your library and deleting any node that wasn’t one of those types, as shown in the following example:

Document doc = new Document("test.docx");
NodeCollection srcNodes = doc.GetChildNodes(NodeType.Any, true);
foreach (Node srcNode in srcNodes)
{
    if (srcNode.NodeType != NodeType.Document
        || srcNode.NodeType != NodeType.Section
        || srcNode.NodeType != NodeType.Body
        …
        || srcNode.NodeType != NodeType.OfficeMath)
        srcNode.Remove();
}

What ends up happening is that the output document is empty, even though I included all 32 node types. I have no clue why.

Second problem:

NodeCollection shapes = doc.GetChildNodes(NodeType.Shape, true);
foreach (Aspose.Words.Drawing.Shape shape in shapes)
{
    shape.Remove();
    /*if (true)
    {
        shape.Remove();
    }*/
}

This code doesn’t delete all the shapes in a document. Some images remain, and I cannot figure out why.

I’d be very grateful if you could look into these issues. Thanks.

awais.hafeez · August 26, 2013, 4:31am

Hi,

Thanks for your inquiry. You may be using an older version of Aspose.Words; I suggest upgrading to the latest version, 13.7.0. You can download it from the following link:
https://releases.aspose.com/words/net

Secondly, please read the sixth point, “All Node Collections are now Live”, on the following page:
https://docs.aspose.com/words/net/aspose-words-for-net-11-8-0-release-notes/

I hope this helps.

Best regards,

a79129 · August 27, 2013, 4:53am

Hi Awais, thanks for the response.
I’m running Aspose.Words 13.7, so it shouldn’t be a version issue.
As for the issue with the Shape nodes, I’m still struggling with it.
When I step through the running program, shapes.Count is actually 0, so the program can’t find any shapes in the document at all. To rule out everything else, I created a blank Word document with a single image inserted into it, but the program can’t find that image, never mind remove it.
Any idea why that is?
Regards

awais.hafeez · August 28, 2013, 12:28am

Hi,
Thanks for your inquiry. Newer Microsoft Word formats (such as DOCX) may store images in a different type of container called DrawingML. You can select such images by using NodeType.DrawingML. I hope this helps.
Best regards,

a79129 · August 28, 2013, 9:06am

Hi Awais,

Thanks, the DrawingML node does the trick…

But I’m still having problems with my first inquiry (the first code example). It seems to remove the entire document, even though I seemingly included all 32 node types in the if clause.

Regards

awais.hafeez · August 29, 2013, 8:08am

Hi,
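Thanks for your inquiry. The problem occurs because of incorrect usage of the conditional OR operator. The first condition, “srcNode.NodeType != NodeType.Document”, is true for every node in the “srcNodes” collection, because GetChildNodes returns only the descendants of the Document and never the Document node itself. With ||, a single true condition is enough, so control always enters the body of the if block. To remove only the nodes that match none of the listed types, the conditions need to be combined with && instead. Also, as the first node in the “srcNodes” collection is of type Section, removing it removes the whole Section with all its child nodes. I suggest taking a look at the tree representation of a Document in the DOM hierarchy: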
Document Tree Navigation
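
To make the difference between the two operators concrete, here is a minimal sketch that runs without Aspose.Words. The NodeType enum below is only a stand-in with a handful of values, and the NodeFilter class, its method names, and the Allowed set are hypothetical names introduced for illustration; the original list of 32 types was elided in the post, so it is not reproduced here.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for a few Aspose.Words NodeType values (illustrative subset, not the real enum).
public enum NodeType { Document, Section, Body, Paragraph, Run, Shape }

public static class NodeFilter
{
    // Hypothetical "keep" list; the real code would list whichever types should survive.
    private static readonly HashSet<NodeType> Allowed = new HashSet<NodeType>
    {
        NodeType.Document, NodeType.Section, NodeType.Body
    };

    // Shape of the condition from the first post: a chain of != joined with ||.
    // A value can equal at most one of the listed types, so at least one
    // comparison is always true and the method returns true for every input.
    public static bool ShouldRemoveWithOr(NodeType t)
    {
        return t != NodeType.Document
            || t != NodeType.Section
            || t != NodeType.Body;
    }

    // Same comparisons joined with &&: true only when t matches none of the listed types.
    public static bool ShouldRemoveWithAnd(NodeType t)
    {
        return t != NodeType.Document
            && t != NodeType.Section
            && t != NodeType.Body;
    }

    // Equivalent to the && chain, but easier to maintain for a long list of types.
    public static bool ShouldRemove(NodeType t)
    {
        return !Allowed.Contains(t);
    }
}
```

With these definitions, NodeFilter.ShouldRemoveWithOr(NodeType.Section) is true while NodeFilter.ShouldRemoveWithAnd(NodeType.Section) is false, and both are true for NodeType.Shape. In the real loop, keep in mind that removing a node also removes everything beneath it, so a kept type only survives if all of its ancestors are kept too.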
Best regards,