Free Support Forum - aspose.com

Copy desired paragraphs from one document to another

I have a requirement where I need to search for a particular label in a document, copy the paragraphs, and paste them at the desired location in the original document.

Is there a way to do this in Aspose?

@wrushu2004

Thanks for your inquiry. Please use the NodeImporter.ImportNode method to import a node from one document into another. The following code example shows how to use this method. Hope this helps you.

// Load the source document and take its first paragraph.
Document srcDoc = new Document(MyDir + "in.docx");
Paragraph paragraph = (Paragraph)srcDoc.GetChild(NodeType.Paragraph, 0, true);

// Create an empty destination document and import the paragraph into it.
Document dstDoc = new Document();
NodeImporter importer = new NodeImporter(srcDoc, dstDoc, ImportFormatMode.KeepSourceFormatting);
Node newNode = importer.ImportNode(paragraph, true);
dstDoc.FirstSection.Body.AppendChild(newNode);

dstDoc.Save(MyDir + "out.docx");

I have got the NodeCollection from another document. Can I use it in the FindAndReplace example?

@wrushu2004

Thanks for your inquiry. The find and replace example is used to find some text and replace it with some other content or text. Could you please share your input and expected output documents here for our reference? Please also share some more detail about your requirement. We will then provide you more information about your query along with the code.

RangesGetText1.zip (41.6 KB)
package com.aspose.words.examples.programming_documents.Ranges;

import com.aspose.words.Document;
import com.aspose.words.DocumentBuilder;
import com.aspose.words.FindReplaceOptions;
import com.aspose.words.IReplacingCallback;
import com.aspose.words.ImportFormatMode;
import com.aspose.words.License;
import com.aspose.words.Node;
import com.aspose.words.NodeCollection;
import com.aspose.words.NodeType;
import com.aspose.words.ReplaceAction;
import com.aspose.words.ReplacingArgs;
import com.aspose.words.examples.Utils;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class RangesGetText1 {

private static String gDataDir;

public static void main(String[] args) throws Exception {

    RangesGetText1 example = new RangesGetText1();
    //ExStart:RangesGetText
    // The path to the documents directory.
    String dataDir = Utils.getDataDir(RangesGetText1.class);

    License license = new License();
    license.setLicense(dataDir + "Aspose.Words.lic");

    Document docDest = new Document(dataDir + "Reco_Doc_64938.docx");


    Document docSrc = new Document(dataDir + "AB10.docx");

    final NodeCollection pChildNodes = docSrc.getChildNodes(NodeType.PARAGRAPH, true);

    final List<Node> fiscalNodes = example.findMatch(pChildNodes, "FISCAL EFFECT:");
    final List<Node> summaryNodes = example.findMatch(pChildNodes, "SUMMARY:");

    
    final NodeCollection childNodes = docDest.getChildNodes(NodeType.PARAGRAPH, true);
    DocumentBuilder builder = new DocumentBuilder(docDest);
    for (Iterator<Node> iterator = childNodes.iterator(); iterator.hasNext();) {
        Node replaceNode = iterator.next();
        if (replaceNode.getText().startsWith("00000001_Summary_html_3")) {                
            builder.moveTo(replaceNode);
            for (Node summaryNode : summaryNodes) {
                builder.insertNode(summaryNode);

                // docDest.insertBefore(summaryNode, replaceNode);
                // docDest.removeChild(replaceNode);
                // replaceNode = summaryNode;
            }
        } else if (replaceNode.getText().startsWith("00000001_Fiscal2_html_2")) {
            for (Node fiscalNode : fiscalNodes) {
                builder.insertNode(fiscalNode);
                // docDest.insertBefore(fiscalNode, replaceNode);
                // docDest.removeChild(replaceNode);
                // replaceNode = fiscalNode;
            }
        }
    }
    // String text = doc.getText();
    // System.out.println(text);
    //ExEnd:RangesGetText
}

private List<Node> findMatch(NodeCollection pChildNodes, String match) {
    List<Node> nodes = new ArrayList<>();
    // startIndex stays -1 until the heading is found; 0 is a valid paragraph index.
    int index = 0, startIndex = -1, pCount = 0;
    for (Iterator<Node> iterator = pChildNodes.iterator(); iterator.hasNext();) {
        Node next = iterator.next();
        if (startIndex < 0) {
            if (next.getText().startsWith(match)) {
                startIndex = index;
                // Count the heading paragraph itself.
                pCount = 1;
            } else {
                index++;
            }
        } else {
            // Stop at the heading that begins the next section.
            if (match.equals("FISCAL EFFECT:")) {
                if (next.getText().startsWith("COMMENTS:")
                        || next.getText().startsWith("Analysis Prepared by:")) {
                    break;
                }
            } else if (match.equals("SUMMARY:")) {
                if (next.getText().startsWith("FISCAL EFFECT:")) {
                    break;
                }
            }
            pCount++;
        }
    }
    System.out.println("startIndex:" + startIndex);
    System.out.println("pCount:" + pCount);
    // pCount is 0 when the heading was not found, so the list stays empty.
    for (int i = 0; i < pCount; i++) {
        System.out.println(pChildNodes.get(startIndex + i).getText());
        nodes.add(pChildNodes.get(startIndex + i));
    }
    return nodes;
}

}

Here I am using the findMatch method to find the paragraphs that I want to insert at a particular label.

@wrushu2004

Thanks for your inquiry. As per my understanding, you want to find some text in a document and insert another document at the location of the matched text. Please use the following code to get the desired output. Hope this helps you.

Document doc = new Document(MyDir + "PLD_SZ8 Product Legacy 2015-11-2_B_Draft.docx");
FindAndInsertDocument insertDocument = new FindAndInsertDocument(MyDir + "AB10.docx");
FindReplaceOptions options = new FindReplaceOptions();
options.setReplacingCallback(insertDocument);
doc.getRange().replace("FISCAL EFFECT:", "", options);

class FindAndInsertDocument implements IReplacingCallback {
    String docpath;
    FindAndInsertDocument(String path)
    {
        docpath = path;
    }
    public int replacing(ReplacingArgs e) throws Exception {
        // This is a Run node that contains either the beginning or the complete match.
        Node currentNode = e.getMatchNode();

        // The first (and may be the only) run can contain text before the match,
        // in this case it is necessary to split the run.
        if (e.getMatchOffset() > 0)
            currentNode = splitRun((Run) currentNode, e.getMatchOffset());

        ArrayList<Run> runs = new ArrayList<>();

        // Find all runs that contain parts of the match string.
        int remainingLength = e.getMatch().group().length();
        while ((remainingLength > 0) && (currentNode != null) && (currentNode.getText().length() <= remainingLength)) {
            runs.add((Run) currentNode);
            remainingLength = remainingLength - currentNode.getText().length();

            // Select the next Run node.
            // Have to loop because there could be other nodes such as BookmarkStart etc.
            do {
                currentNode = currentNode.getNextSibling();
            } while ((currentNode != null) && (currentNode.getNodeType() != NodeType.RUN));
        }

        // Split the last run that contains the match if there is any text left.
        if ((currentNode != null) && (remainingLength > 0)) {
            splitRun((Run) currentNode, remainingLength);
            runs.add((Run) currentNode);
        }

        DocumentBuilder builder = new DocumentBuilder((Document) e.getMatchNode().getDocument());
        builder.moveTo(runs.get(0));
        builder.insertDocument(new Document(docpath), ImportFormatMode.KEEP_SOURCE_FORMATTING);

        for (Run run : runs)
            run.remove();
        // Signal to the replace engine to do nothing because we have already done everything we wanted.
        return ReplaceAction.SKIP;
    }

    /**
     * Splits text of the specified run into two runs. Inserts the new run just
     * after the specified run.
     */
    private  Run splitRun(Run run, int position) throws Exception {
        Run afterRun = (Run) run.deepClone(true);
        afterRun.setText(run.getText().substring(position));
        run.setText(run.getText().substring(0, position));
        run.getParentNode().insertAfter(afterRun, run);
        return afterRun;
    }
}
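
To make the split-and-collect logic in the callback easier to follow, here is a minimal sketch that performs the same steps on plain strings instead of Run nodes. It does not use Aspose.Words at all; the class name RunSplitSketch and the method cutMatch are hypothetical names made up for this illustration, and each string stands in for the text of one run.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RunSplitSketch {

    /**
     * Removes a match from a list of run texts (hypothetical helper, not an Aspose API).
     * startRun is the index of the run where the match begins, offset is the
     * position of the match inside that run, and length is the match length.
     */
    public static List<String> cutMatch(List<String> runs, int startRun, int offset, int length) {
        List<String> work = new ArrayList<>(runs);
        int current = startRun;

        // Split the first run when the match starts in the middle of it.
        if (offset > 0) {
            String text = work.get(current);
            work.set(current, text.substring(0, offset));
            work.add(current + 1, text.substring(offset));
            current++;
        }

        int first = current;
        int remaining = length;

        // Walk over the runs that are completely covered by the match.
        while (remaining > 0 && current < work.size()
                && work.get(current).length() <= remaining) {
            remaining -= work.get(current).length();
            current++;
        }

        // Split the last run when the match ends in the middle of it.
        if (remaining > 0 && current < work.size()) {
            String text = work.get(current);
            work.set(current, text.substring(0, remaining));
            work.add(current + 1, text.substring(remaining));
            current++;
        }

        // Runs in [first, current) now hold exactly the matched text; drop them.
        work.subList(first, current).clear();
        return work;
    }

    public static void main(String[] args) {
        // "FISCAL EFFECT:" (14 characters) is spread over three runs here.
        List<String> runs = Arrays.asList("See FISCAL ", "EFF", "ECT: below");
        System.out.println(cutMatch(runs, 0, 4, 14));
    }
}
```

In the real callback, the position of the first removed run is where the builder is moved before inserting the new content.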

I do not want to insert the entire document. We identify the list of paragraphs from the other document, which I achieved through the findMatch method.

I then want to insert that list at the matching label position in the source document.

@wrushu2004

Thanks for your inquiry. In the shared code example, please replace the insertDocument line of code with your code that imports the specific paragraphs.

If you still face problems, please share your expected output document here for our reference. We will then provide you more information about your query.

My code:
final List<Node> fiscalNodes = example.findMatch(pChildNodes, "FISCAL EFFECT:");
final List<Node> summaryNodes = example.findMatch(pChildNodes, "SUMMARY:");

Code from above:
builder.insertDocument(new Document(docpath), ImportFormatMode.KEEP_SOURCE_FORMATTING);

Can you let me know how I can replace it as you mentioned above?
Tahir, I need help getting this working before we move to the next stage.
FindAndReplace.zip (44.2 KB)

@wrushu2004

Thanks for your inquiry. In your case, we suggest the following solution.

  1. Extract the content of “FISCAL EFFECT:” from document AB10.docx.
  2. Generate the document from extracted nodes.
  3. Insert the generated document into “Reco_Doc_64925.docx” using FindAndInsertDocument class.

Please refer to the following article about extracting specific content from a document and generating a document from that content.
Extract Selected Content Between Nodes
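
The flow of the steps above can be sketched on plain strings, with each string standing in for one paragraph. This is only an illustration under stated assumptions: it does not call Aspose.Words, the class SectionSpliceSketch and its two methods are hypothetical names, and it assumes the label paragraph is replaced by the extracted content. With real documents the extracted nodes would also have to be imported into the destination document, a step that plain strings do not need.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SectionSpliceSketch {

    /**
     * Returns the paragraphs from the one starting with heading up to, but not
     * including, the next paragraph that starts with one of the stop prefixes.
     * Returns an empty list when the heading is not found.
     */
    public static List<String> extractSection(List<String> paragraphs, String heading,
                                              List<String> stopPrefixes) {
        List<String> section = new ArrayList<>();
        boolean inside = false;
        for (String p : paragraphs) {
            if (!inside) {
                if (p.startsWith(heading)) {
                    inside = true;
                    section.add(p);
                }
                continue;
            }
            for (String stop : stopPrefixes) {
                if (p.startsWith(stop)) {
                    return section;
                }
            }
            section.add(p);
        }
        return section;
    }

    /** Returns a copy of dest in which each paragraph starting with label is replaced by section. */
    public static List<String> spliceAtLabel(List<String> dest, String label, List<String> section) {
        List<String> out = new ArrayList<>();
        for (String p : dest) {
            if (p.startsWith(label)) {
                out.addAll(section);
            } else {
                out.add(p);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample paragraphs, not the content of the attached documents.
        List<String> source = Arrays.asList("SUMMARY:", "Bill text.", "FISCAL EFFECT:",
                "Unknown costs.", "COMMENTS:", "None.");
        List<String> dest = Arrays.asList("Intro", "00000001_Fiscal2_html_2", "Outro");

        List<String> fiscal = extractSection(source, "FISCAL EFFECT:",
                Arrays.asList("COMMENTS:", "Analysis Prepared by:"));
        System.out.println(spliceAtLabel(dest, "00000001_Fiscal2_html_2", fiscal));
    }
}
```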