List numbers are incorrect after extracting content from document using Java

actual.PNG (7.8 KB)
extracted.PNG (22.4 KB)

FYI, I used the source code below to extract the numbered text from a Word document.

// Requires: import com.aspose.words.*;
private static Document extractContent(Node startNode, Node endNode) throws Exception {
    // Check that the start and end nodes are children of the body
    if (startNode.getParentNode().getNodeType() != NodeType.BODY
            || endNode.getParentNode().getNodeType() != NodeType.BODY)
        throw new Exception("Start and end nodes should be children of the main story (body)");

    // Clone the original document;
    // this is needed to preserve the styles of the original document
    Document srcDoc = (Document) startNode.getDocument();
    Document dstDoc = srcDoc.deepClone();
    dstDoc.removeAllChildren();

    // Now copy the parent nodes of the start node to the destination document;
    // these will be Section and Body.
    Node firstSect = dstDoc.importNode(startNode.getAncestor(NodeType.SECTION), true,
        ImportFormatMode.USE_DESTINATION_STYLES);
    dstDoc.appendChild(firstSect);

    // Remove content from the section, except headers/footers
    dstDoc.getLastSection().getBody().removeAllChildren();
    Node currNode = startNode;

    Node dstNode;
    // Copy content
    while (!currNode.equals(endNode)) {
        // Import node
        dstNode = dstDoc.importNode(currNode, true, ImportFormatMode.USE_DESTINATION_STYLES);
        dstDoc.getLastSection().getBody().appendChild(dstNode);

        // Move to the next node
        if (currNode.getNextSibling() != null)
            currNode = currNode.getNextSibling();
        // Otherwise, move to the next section
        else {
            Node sect = currNode.getAncestor(NodeType.SECTION);
            if (sect.getNextSibling() != null) {
                dstNode = dstDoc.importNode(sect.getNextSibling(), true, ImportFormatMode.USE_DESTINATION_STYLES);
                dstDoc.appendChild(dstNode);
                dstDoc.getLastSection().getBody().removeAllChildren();
                currNode = ((Section) sect.getNextSibling()).getBody().getFirstChild();
            } else {
                break;
            }
        }
    }
    return dstDoc;
}
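
For readers following the loop above: it copies nodes starting at the start node and stops as soon as it reaches the end node, so the end node itself is not copied. The sketch below mirrors only that traversal using plain Java lists. It does not use Aspose.Words at all; the class name `ExtractRangeSketch`, the `extract` method, and the section/index model are hypothetical and exist purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ExtractRangeSketch {

    // Collects items from (startSec, startIdx) up to, but not including,
    // (endSec, endIdx). Each inner list stands in for the body of one section.
    static List<String> extract(List<List<String>> sections,
                                int startSec, int startIdx,
                                int endSec, int endIdx) {
        List<String> out = new ArrayList<>();
        int sec = startSec;
        int idx = startIdx;
        while (!(sec == endSec && idx == endIdx)) {
            // "Import" the current node into the result
            out.add(sections.get(sec).get(idx));
            if (idx + 1 < sections.get(sec).size()) {
                // A next sibling exists in the same section
                idx++;
            } else if (sec + 1 < sections.size() && !sections.get(sec + 1).isEmpty()) {
                // Move to the first child of the next section
                sec++;
                idx = 0;
            } else {
                // No more content to walk
                break;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> doc = Arrays.asList(
                Arrays.asList("p1", "p2", "p3"),
                Arrays.asList("p4", "p5"));
        // Start at "p2", end at "p5" (the end item is excluded)
        System.out.println(extract(doc, 0, 1, 1, 1)); // prints [p2, p3, p4]
    }
}
```

If the end node should also appear in the output, one option under this model is to copy it once more after the loop finishes.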

@tamaan456

Please ZIP and attach the input Word document and the problematic output document here for testing. Please also share the start and end nodes of the extracted content. We will investigate the issue and provide you with more information.

doc.zip (7.9 KB)

Hello Tahir,

Thanks for your prompt response. Please find attached the ZIP file you requested.

Note: We are extracting text from the input Word document using the bookmark 'KeyDetailSec'.

Looking forward to your response.
Thank you.

@tamaan456

We tested the scenario with the latest version, Aspose.Words for Java 20.8, and could not reproduce the issue. Please upgrade to Aspose.Words for Java 20.8. We have attached the output document to this post for your reference. 20.8.zip (737 Bytes)

We also suggest using the code shared in the following article:
Extract Selected Content Between Nodes

Hello Tahir,

Thank you so much, the issue is resolved now.

I am a developer at a company that is a licensed user of Aspose.Total.

Thanks once again for your kind response.

@tamaan456

Thanks for your feedback. Please feel free to ask if you have any questions about Aspose.Words; we will be happy to help.