Empty shape imported

We are having issues with some of our documents while upgrading to Aspose.Words 17.7 for Java. In some cases, the imported document contains shapes that seemingly do not exist in the Word file. Take the attached test case, a DOCX file saved in compatibility mode. Using 17.7, the following code reports 1 shape object, apparently right behind the text “(peak intensity and width)”:

Document doc = new Document("/tmp/5484127_11_part.docx");
NodeCollection shapes = doc.getChildNodes(NodeType.SHAPE, true);
System.out.println("Shapes: " + shapes.getCount());

When opening the file in Word, I cannot see any image or shape. When using Aspose.Words 3.3 to import the file (the version we have used so far), the same code reports 0 shape objects, just as we think it should.

Any idea what’s going wrong, or am I missing something?
Thanks, Markus
docfile.zip (20.1 KB)

@sschepper

Thanks for your inquiry. The Aspose.Words result is the expected one: your document does contain a shape. Please check the document.xml of your DOCX file; it includes a shape element. Please find the attached document for your reference. Hopefully it will help.

Shape.png (31.1 KB)
5484127_11_part.zip (23.5 KB)
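
For anyone who wants to repeat the document.xml check without unzipping the file by hand, here is a minimal sketch that uses only the JDK. It is not part of the Aspose.Words API. The class and method names (DocxMarkupCheck, readDocumentXml, count) are made up for illustration, and it assumes the main part is stored at word/document.xml and that the file uses the conventional mc: prefix for alternate content, which a given document may not.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not an Aspose.Words class.
class DocxMarkupCheck {

    /** Counts non-overlapping occurrences of needle in haystack. */
    static int count(String haystack, String needle) {
        int n = 0;
        int i = haystack.indexOf(needle);
        while (i >= 0) {
            n++;
            i = haystack.indexOf(needle, i + needle.length());
        }
        return n;
    }

    /** Returns the text of word/document.xml, or null if the archive has no such entry. */
    static String readDocumentXml(InputStream docx) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                if (!"word/document.xml".equals(e.getName())) {
                    continue;
                }
                // Read the current entry to its end.
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[8192];
                int read;
                while ((read = zip.read(buf)) != -1) {
                    out.write(buf, 0, read);
                }
                return new String(out.toByteArray(), StandardCharsets.UTF_8);
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java DocxMarkupCheck <file.docx>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            String xml = readDocumentXml(in);
            if (xml == null) {
                System.err.println("word/document.xml not found");
                return;
            }
            // Assumes the conventional "mc:" prefix; a plain text search, not a namespace-aware parse.
            System.out.println("mc:AlternateContent blocks: " + count(xml, "<mc:AlternateContent"));
            System.out.println("mc:Fallback blocks: " + count(xml, "<mc:Fallback"));
        }
    }
}
```

Run it as, for example, java DocxMarkupCheck /tmp/5484127_11_part.docx. A non-zero count only says that alternate content is present in the markup; it does not tell you which node in the loaded document came from it.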

Hi Tilal,

Thank you for your reply. You are right, there is a shape in document.xml. However, it is alternate content, and neither the choice nor the fallback is visible in Word or accessible in the Aspose.Words object model, so we would like to delete it after loading. Is there a way to detect shapes/nodes that came from alternate content?

Thanks, Markus

@sschepper

Thanks for your feedback. Please note that the shape is accessible in the Aspose.Words DOM; see Capture.PNG (44.7 KB). The shape is a text box, and you can remove it as follows.

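Document doc = new Document("5484127_11_part.docx");
NodeCollection<Shape> shapes = doc.getChildNodes(NodeType.SHAPE, true);

// Iterate over a snapshot: the collection is live, so removing nodes
// while iterating it directly can skip shapes.
for (Node node : shapes.toArray())
{
    Shape shape = (Shape) node;
    if (shape.getShapeType() == ShapeType.TEXT_BOX)
    {
        // Print the text box content before removing it.
        System.out.println(shape.getText());
        shape.remove();
    }
}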