Extract content from bookmark throws exception using Java

dineshvme · August 31, 2020, 2:11am

Aspose.zip (21.5 KB)
Hi Greetings,

My organization uses Aspose to manipulate Word documents. Whenever we upgrade Aspose, we run into annoying issues that take extra time to troubleshoot and that require workarounds to avoid business interruption. This is a similar kind of issue, but I would like to hear a good approach from the Aspose side. While extracting content from a bookmark using the code below, Aspose throws an exception (current version: aspose-words-20.4-jdk16).

Java code: Extract Selected Content Between Nodes in Java | Aspose.Words for Java

CompositeNode cloneNode = (CompositeNode) currNode.deepClone( true );

Exception:

Exception in thread "main" java.lang.ClassCastException: com.aspose.words.BookmarkEnd cannot be cast to com.aspose.words.CompositeNode
    at ExtractContentBetweenBookmarks.extractContent(ExtractContentBetweenBookmarks.java:91)
    at ExtractContentBetweenBookmarks.main(ExtractContentBetweenBookmarks.java:41)

The same feature worked well on 13.4, and my organization uses the same Java code provided by Aspose to extract content from bookmarks. The sample documents are attached.

Test_2.docx is working
Test_1.docx is not working.
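
For context, the stack trace points at the unconditional cast to CompositeNode: the node being cloned is a BookmarkEnd, which is not a composite node. The failure mode can be sketched without Aspose at all. The classes below (SketchNode, SketchCompositeNode, CastSketch) are hypothetical stand-ins, not the Aspose.Words types, and the sketch only illustrates why a type check before the cast avoids the exception.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a leaf node such as BookmarkEnd; not the Aspose class.
class SketchNode {
    final String name;
    SketchNode(String name) { this.name = name; }
    boolean isComposite() { return false; }
}

// Hypothetical stand-in for a node that can hold children, such as Paragraph.
class SketchCompositeNode extends SketchNode {
    final List<SketchNode> children = new ArrayList<>();
    SketchCompositeNode(String name) { super(name); }
    @Override boolean isComposite() { return true; }
}

class CastSketch {
    // Mirrors the failing line: an unconditional downcast.
    static String unsafeCast(SketchNode node) {
        try {
            SketchCompositeNode composite = (SketchCompositeNode) node;
            return "cast ok: " + composite.name;
        } catch (ClassCastException e) {
            return "ClassCastException for " + node.name;
        }
    }

    // Guarded variant: check the runtime type before casting.
    static String guardedCast(SketchNode node) {
        if (node instanceof SketchCompositeNode) {
            return "cast ok: " + node.name;
        }
        return "not composite: " + node.name;
    }

    public static void main(String[] args) {
        SketchNode paragraph = new SketchCompositeNode("Paragraph");
        SketchNode bookmarkEnd = new SketchNode("BookmarkEnd");
        System.out.println(unsafeCast(paragraph));    // cast ok: Paragraph
        System.out.println(unsafeCast(bookmarkEnd));  // ClassCastException for BookmarkEnd
        System.out.println(guardedCast(bookmarkEnd)); // not composite: BookmarkEnd
    }
}
```

Presumably Test_1.docx contains a BookmarkEnd that sits directly at block level, while Test_2.docx does not; that is an inference from the stack trace, not something confirmed in the thread.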

tahir.manzoor · August 31, 2020, 9:30am

@dineshvme

Please use the following modified extractContent method to fix this issue. Hope this helps you.

public static ArrayList<Node> extractContent(Node startNode, Node endNode, boolean isInclusive) throws Exception {
    // First check that the nodes passed to this method are valid for use.
    verifyParameterNodes(startNode, endNode);

    // Create a list to store the extracted nodes.
    ArrayList<Node> nodes = new ArrayList<Node>();

    // Keep a record of the original nodes passed to this method so we can split marker nodes if needed.
    Node originalStartNode = startNode;
    Node originalEndNode = endNode;

    // Extract content based on block level nodes (paragraphs and tables). Traverse through parent nodes to find them.
    // We will split the content of first and last nodes depending if the marker nodes are inline
    while (startNode.getParentNode().getNodeType() != NodeType.BODY)
        startNode = startNode.getParentNode();

    while (endNode.getParentNode().getNodeType() != NodeType.BODY)
        endNode = endNode.getParentNode();

    boolean isExtracting = true;
    boolean isStartingNode = true;
    boolean isEndingNode;
    // The current node we are extracting from the document.
    Node currNode = startNode;

    // Begin extracting content. Process all block-level nodes, splitting the first and last nodes when needed so paragraph formatting is retained.
    // This method is a little more complex than a regular extractor because it also has to handle inline nodes, fields, bookmarks, etc.
    while (isExtracting) {
        // Clone the current node and its children to obtain a copy.
        CompositeNode cloneNode = null;

        if (currNode.isComposite()) {
            cloneNode = (CompositeNode) currNode.deepClone(true);
        } else if (currNode.getNodeType() == NodeType.BOOKMARK_END) {
            // A block-level BookmarkEnd is not a CompositeNode, so wrap its clone
            // in a new paragraph instead of casting it.
            Paragraph paragraph = new Paragraph(currNode.getDocument());
            paragraph.getChildNodes().add(currNode.deepClone(true));
            cloneNode = paragraph;
        }
        // Note: any other non-composite node type leaves cloneNode as null.

        isEndingNode = currNode.equals(endNode);

        if (isStartingNode || isEndingNode) {
            // We need to process each marker separately so pass it off to a separate method instead.
            if (isStartingNode) {
                processMarker(cloneNode, nodes, originalStartNode, isInclusive, isStartingNode, isEndingNode);
                isStartingNode = false;
            }

            // Conditional needs to be separate as the block-level start and end markers may be the same node.
            if (isEndingNode) {
                processMarker(cloneNode, nodes, originalEndNode, isInclusive, isStartingNode, isEndingNode);
                isExtracting = false;
            }
        } else
            // Node is not a start or end marker, simply add the copy to the list.
            nodes.add(cloneNode);

        // Move to the next node and extract it. If next node is null that means the rest of the content is found in a different section.
        if (currNode.getNextSibling() == null && isExtracting) {
            // Move to the next section.
            Section nextSection = (Section) currNode.getAncestor(NodeType.SECTION).getNextSibling();
            currNode = nextSection.getBody().getFirstChild();
        } else {
            // Move to the next node in the body.
            currNode = currNode.getNextSibling();
        }
    }

    // Return the nodes between the node markers.
    return nodes;
}
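
The method above wraps only a BookmarkEnd; for any other non-composite node at block level, cloneNode stays null. One possible variation is to wrap every non-composite node the same way. The sketch below shows that idea with hypothetical stand-in classes (DemoNode, DemoComposite, WrapSketch), not the Aspose.Words API. Whether every leaf node type can legitimately be placed inside a Paragraph in Aspose.Words is an assumption that would need to be verified before applying this to the real method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a leaf node; not the Aspose class.
class DemoNode {
    final String type;
    DemoNode(String type) { this.type = type; }
    boolean isComposite() { return false; }
    DemoNode deepClone() { return new DemoNode(type); }
}

// Hypothetical stand-in for a node that can hold children.
class DemoComposite extends DemoNode {
    final List<DemoNode> children = new ArrayList<>();
    DemoComposite(String type) { super(type); }
    @Override boolean isComposite() { return true; }
    @Override DemoComposite deepClone() {
        DemoComposite copy = new DemoComposite(type);
        for (DemoNode child : children) {
            copy.children.add(child.deepClone());
        }
        return copy;
    }
}

class WrapSketch {
    // Always returns a composite copy: composites are cloned directly,
    // any leaf node is cloned and wrapped in a fresh "Paragraph" container.
    static DemoComposite cloneAsComposite(DemoNode node) {
        if (node.isComposite()) {
            return (DemoComposite) node.deepClone();
        }
        DemoComposite wrapper = new DemoComposite("Paragraph");
        wrapper.children.add(node.deepClone());
        return wrapper;
    }

    public static void main(String[] args) {
        List<DemoNode> bodyChildren = new ArrayList<>();
        bodyChildren.add(new DemoComposite("Paragraph"));
        bodyChildren.add(new DemoNode("BookmarkEnd"));
        bodyChildren.add(new DemoNode("EditableRangeStart"));
        for (DemoNode node : bodyChildren) {
            DemoComposite copy = cloneAsComposite(node);
            System.out.println(node.type + " -> " + copy.type
                    + " with " + copy.children.size() + " child node(s)");
        }
    }
}
```

With this shape, the result is never null, so the later calls that receive the clone do not need a separate null check.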

dineshvme · August 31, 2020, 11:07am

Thank you so much! @tahir.manzoor