I'm trying to parse a Word document, specifically to extract figures. Figures in these documents are a set of Shapes on the drawing layer and InlineShapes on the text layer. They aren't grouped, and they aren't children of a particular canvas.
Before I started trying Aspose.Words, I was using the Microsoft Interop API to first turn the InlineShapes into Shapes (a conversion method is available on the InlineShape interface), then reason about each shape's location to determine which shapes make up a single figure. That's where I stopped with Interop. I need to be able to extract the set of shapes as a single SVG (preferably) or bitmap. Word has no such functionality in the Interop API.
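To make the "reason about location" step concrete, here is a rough sketch of the kind of grouping I mean. It is plain Python over (left, top, width, height) tuples, not Interop or Aspose code; the names `near` and `cluster` and the 5-unit gap are made up for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: (left, top, width, height) in page units.
Rect = Tuple[float, float, float, float]


def near(a: Rect, b: Rect, gap: float) -> bool:
    """True if the rectangles overlap or lie within `gap` of each other on both axes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return (ax - gap <= bx + bw and bx - gap <= ax + aw and
            ay - gap <= by + bh and by - gap <= ay + ah)


def cluster(rects: List[Rect], gap: float = 5.0) -> List[List[int]]:
    """Group rectangle indices into figures using a simple union-find."""
    parent = list(range(len(rects)))

    def find(i: int) -> int:
        # Walk up to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair of rectangles that are close enough to each other.
    for i in range(len(rects)):
        for j in range(i + 1, len(rects)):
            if near(rects[i], rects[j], gap):
                parent[find(j)] = find(i)

    # Collect indices by their root, in first-seen order.
    groups: Dict[int, List[int]] = {}
    for i in range(len(rects)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

With three shapes where the first two sit 2 units apart and the third is far away, `cluster` puts the first two in one figure and the third in its own.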
Enter Aspose.Words. I've attempted to group the shapes under a single ShapeGroup object, but when I call GetShapeRenderer().Save(...) the output image files are blank. I added the individual Shapes to the ShapeGroup using AppendChild(), then set the Width and Height of the ShapeGroup to a boundary that encompasses all shapes.
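For clarity, the boundary I compute for the ShapeGroup is roughly the union of the member shapes' rectangles. The sketch below is plain Python over (left, top, width, height) tuples rather than Aspose objects, and `bounding_box` is just an illustrative name.

```python
from typing import Iterable, Tuple

# Hypothetical representation: (left, top, width, height) in page units.
Rect = Tuple[float, float, float, float]


def bounding_box(rects: Iterable[Rect]) -> Rect:
    """Return the smallest rectangle that encloses every input rectangle."""
    rects = list(rects)
    if not rects:
        raise ValueError("need at least one rectangle")
    left = min(r[0] for r in rects)
    top = min(r[1] for r in rects)
    right = max(r[0] + r[2] for r in rects)
    bottom = max(r[1] + r[3] for r in rects)
    # Width and height are what I assign to the group; left and top stay page-relative here.
    return (left, top, right - left, bottom - top)
```

The width and height from this result are what I set on the ShapeGroup.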
What am I doing wrong?
BTW, these Word documents are the result of PDF->DOCX conversion. I did this because I find the Word APIs easier than the PDF ones. However, if the Aspose PDF API is as easy to use and understand as the Aspose.Words Document Object Model, I may read the PDF directly. Thoughts on that?
I also tried using .GetChildNodes(Aspose.Words.NodeType.Shape, true) to find all Shapes in the document or section, and was surprised to get only a small fraction of them. Switching between the snapshot and live collections didn't seem to make a difference.