Extract content with a section break

steve847 · October 3, 2011, 4:08am

I am using the code from the documentation (see link below) in order to extract content in a bookmark from one document and copy it to another.
However, if the content contains a section break (created in Office 2007 by going to the Page Layout ribbon and selecting ‘Section Break - Next Page’ from the Breaks menu), the section break is not extracted.
I debugged it and found that when the node is cloned (in the ExtractContent() method), the Paragraph.IsEndOfSection property of the cloned node is ‘false’ even though the original node has it set to ‘true’, so I think the node is not cloned properly.
Please verify if this is a bug and let me know if there is a workaround.
https://docs.aspose.com/words/net/how-to-extract-selected-content-between-nodes-in-a-document/

alexey.noskov · October 3, 2011, 5:44am

Hi
Thanks for your request. This is not a bug. The version of the method provided in the documentation simply ignores section breaks. You can try using the code provided below:

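/// <summary>
/// Extracts content between nodes.
/// Nodes should be direct children of the main story (body).
/// </summary>
/// <param name="startNode">Start node</param>
/// <param name="endNode">End node</param>
/// <returns>New document that contains the extracted content</returns>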
public Document ExtractContentBetweenNodes(Node startNode, Node endNode)
{
    // Check whether start and end nodes are children of body
    if (startNode.ParentNode.NodeType != NodeType.Body || endNode.ParentNode.NodeType != NodeType.Body)
        throw new Exception("Start and end nodes should be children of main story(body)");
    // Clone the original document,
    // this is needed to preserve styles of the original document
    Document srcDoc = (Document) startNode.Document;
    Document dstDoc = srcDoc.Clone();
    dstDoc.RemoveAllChildren();
    // Now we should copy the parent nodes of the start node to the destination document;
    // these will be Section, Body, etc.
    // Import the section that contains the start node
    Node firstSect = dstDoc.ImportNode(startNode.GetAncestor(NodeType.Section), true, ImportFormatMode.UseDestinationStyles);
    dstDoc.AppendChild(firstSect);
    // Remove content from the section, except headers/footers
    dstDoc.LastSection.Body.RemoveAllChildren();
    Node currNode = startNode;
    Node dstNode;
    // Copy content
    while (!currNode.Equals(endNode))
    {
        // Import node
        dstNode = dstDoc.ImportNode(currNode, true, ImportFormatMode.UseDestinationStyles);
        dstDoc.LastSection.Body.AppendChild(dstNode);
        // move to the next node
        if (currNode.NextSibling != null)
            currNode = currNode.NextSibling;
        // Move to the next section
        else
        {
            Node sect = currNode.GetAncestor(NodeType.Section);
            if (sect.NextSibling != null)
            {
                dstNode = dstDoc.ImportNode(sect.NextSibling, true, ImportFormatMode.UseDestinationStyles);
                dstDoc.AppendChild(dstNode);
                dstDoc.LastSection.Body.RemoveAllChildren();
                currNode = ((Section) sect.NextSibling).Body.FirstChild;
            }
            else
            {
                break;
            }
        }
    }
    return dstDoc;
}
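
For the bookmark case from the original question, a call could look roughly like the sketch below. It is only an illustration: the helper name ExtractBookmarkToFile and its parameters are hypothetical, and it assumes the bookmark start and end sit in paragraphs that are direct children of a section body, as the method above requires. Note that the loop above stops when it reaches the end node, so the end paragraph itself is not copied.

```csharp
using System;
using Aspose.Words;

// Hypothetical helper, meant to live in the same class as ExtractContentBetweenNodes.
public void ExtractBookmarkToFile(string inputPath, string bookmarkName, string outputPath)
{
    Document doc = new Document(inputPath);

    // The indexer returns null when no bookmark with this name exists.
    Bookmark bookmark = doc.Range.Bookmarks[bookmarkName];
    if (bookmark == null)
        throw new ArgumentException("Bookmark not found: " + bookmarkName);

    // The method expects body-level nodes, so pass the paragraphs
    // that hold the bookmark start and the bookmark end.
    Node startNode = bookmark.BookmarkStart.ParentNode;
    Node endNode = bookmark.BookmarkEnd.ParentNode;

    // Copies everything from startNode up to (but not including) endNode,
    // starting a new section in the result wherever the source does.
    Document extracted = ExtractContentBetweenNodes(startNode, endNode);
    extracted.Save(outputPath);
}
```

If the bookmark starts and ends in the same paragraph, this sketch produces a document with an empty body, because the end node is excluded.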

Hope this helps.
Best regards,

steve847 · October 3, 2011, 7:38am

Thanks for the quick response.
There seems to be quite a difference in logic between the above version of the method and the version in the documentation provided in my link. What is the difference between the two in terms of functionality?
Also, where exactly in the documentation version of the code are section breaks being excluded? Perhaps I could just modify that version so that it does not ignore the section breaks?
You see, I have already integrated that code into our product, which has been tested thoroughly, and I would like, if possible, to continue using it, as completely new code means I have to retest everything (for example, the version you posted here does not clone nodes, which is a big difference in terms of implementation).
Thanks.

alexey.noskov · October 3, 2011, 9:10am

Hi
Thanks for your request. If you take a look at the method provided in the documentation you can see the following:

// Move to the next node and extract it. If next node is null that means the rest of the content is found in a different section.
if (currNode.NextSibling == null && isExtracting)
{
    // Move to the next section.
    Section nextSection = (Section) currNode.GetAncestor(NodeType.Section).NextSibling;
    currNode = nextSection.Body.FirstChild;
}
else
{
    // Move to the next node in the body.
    currNode = currNode.NextSibling;
}

As you can see, when it encounters a section break it simply takes the first node of the next section as the current node, so section breaks are ignored.
Sure, you are free to modify both of the methods.
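
As a rough sketch of such a modification (not taken from the documentation), the branch that moves to the next section could also append an empty copy of that section to the destination document, using the same import step as the method posted earlier. The helper name AppendEmptySectionCopy and the dstDoc variable are assumptions for illustration; the documentation version would need access to a destination document at that point.

```csharp
using Aspose.Words;

// Hypothetical helper: appends a copy of srcSection to dstDoc and clears its body,
// so the page setup and headers/footers are kept but the content starts empty.
static Section AppendEmptySectionCopy(Document dstDoc, Section srcSection)
{
    Section copy = (Section) dstDoc.ImportNode(srcSection, true, ImportFormatMode.UseDestinationStyles);
    dstDoc.AppendChild(copy);
    copy.Body.RemoveAllChildren();
    return copy;
}

// Possible use inside the branch shown above (dstDoc is assumed to exist):
//
//     Section nextSection = (Section) currNode.GetAncestor(NodeType.Section).NextSibling;
//     AppendEmptySectionCopy(dstDoc, nextSection);
//     currNode = nextSection.Body.FirstChild;
```

Nodes extracted after this point would then be appended to dstDoc.LastSection.Body rather than to the first section.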
Best regards,

steve847 · October 6, 2011, 5:56am

Hi, thanks for pointing out the code. I managed to modify this piece of code to import the section instead of skipping it. Thanks!

alexey.noskov · October 6, 2011, 10:27am

Hi
It is great that you managed to implement what you need. Please feel free to ask if you run into any issues; we will be glad to help you.
Best regards,