Can you please point me in the right direction with the following request?
I am trying to split a Word document into other documents. For example, my document has different sections: INTRODUCTION, REVIEW and CONCLUSION. I want to split the Review section out into a new document. The document may look like this:
Introduction
this is my introduction
Review
this is my review
Conclusion
this is my conclusion
My new document will contain only:
Review
this is my review
Keep in mind that the location of the Review section may vary based on the document. Some documents may have multiple sections, so I never know exactly where the Review section will be. Maybe I could search for the word "Review"? I am attaching a test document. Please advise. Thank you ahead of time.
Hi
Thanks for your inquiry. Please use the following code snippet to save each section as a separate document.
// Requires: using Aspose.Words;
// Open the source document.
Document doc = new Document("c:/temp/DocToSplit.docx");
// Loop through all sections.
for (int i = 0; i < doc.Sections.Count; i++)
{
    Section section = doc.Sections[i];
    // Create an empty document (remove the default section it starts with).
    Document subDoc = new Document();
    subDoc.RemoveAllChildren();
    // Import the section and append it to the empty document.
    subDoc.AppendChild(subDoc.ImportNode(section, true, ImportFormatMode.KeepSourceFormatting));
    // Save the sub-document as DOCX.
    subDoc.Save("c:/temp/DocToSplit" + i + ".docx");
}
In case of any ambiguity, please let me know.
Hi
Moreover, please note that DocumentExplorer is a very useful tool which easily enables us to see the entire document structure. You can find DocumentExplorer in the folder where you installed Aspose.Words e.g. C:\Program Files (x86)\Aspose\Aspose.Words for .NET\Demos\CSharp\DocumentExplorer\bin\DocumentExplorer.exe. Below is the DOM structure of your document as viewed with DocumentExplorer:
DocumentExplorer shows Introduction, Review and Conclusion within a single section. I have attached the output documents as well.
Can I split it even if everything is in the same section? The documents are created by users, so I never know whether a document will be created as one section or as multiple sections.
Hi
Thanks for your inquiry. You can extract the content between two bookmarks and save it in a separate Word document. I placed two bookmarks, named “start” and “end”, in your document.
Please use the following code snippet:
// Requires: using Aspose.Words;
private void ExtractContent(Document srcDoc, string startBookmark, string endBookmark, string outputFile)
{
    // Get the start and end bookmarks from the source document.
    Bookmark start = srcDoc.Range.Bookmarks[startBookmark];
    Bookmark end = srcDoc.Range.Bookmarks[endBookmark];
    // If the start or end bookmark does not exist in the document, exit the function.
    if (start == null || end == null)
        return;
    // Get the first node in the selection: the body-level ancestor of the start bookmark.
    Node startNode = start.BookmarkStart.ParentNode;
    while (startNode.ParentNode.NodeType != NodeType.Body)
        startNode = startNode.ParentNode;
    // Get the last node in the selection: the body-level ancestor of the end bookmark.
    Node endNode = end.BookmarkStart.ParentNode;
    while (endNode.ParentNode.NodeType != NodeType.Body)
        endNode = endNode.ParentNode;
    // Create a new document.
    Document dstDoc = new Document();
    Node currNode = startNode;
    // Copy content up to (but not including) the node that holds the end bookmark.
    while (currNode != null && !currNode.Equals(endNode))
    {
        Node dstNode = dstDoc.ImportNode(currNode, true, ImportFormatMode.KeepSourceFormatting);
        dstDoc.FirstSection.Body.AppendChild(dstNode);
        // If the next node is null, we should move to the next section.
        if (currNode.NextSibling == null)
        {
            Section nextSection = (Section)currNode.GetAncestor(NodeType.Section).NextSibling;
            // Stop if there is no further section to move to.
            currNode = nextSection == null ? null : nextSection.Body.FirstChild;
        }
        else
        {
            // Move to the next node.
            currNode = currNode.NextSibling;
        }
    }
    // Save the output document.
    dstDoc.Save(outputFile);
}
I have attached input/output documents. In case of any ambiguity, please let me know.
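If the users do not insert bookmarks, one alternative is to search for the heading text, as suggested in the original question. The following is only a minimal sketch, not a tested solution for your documents. It assumes that each section title ("Introduction", "Review", "Conclusion") is a body-level paragraph formatted with the built-in Heading 1 style, and that the whole Review section sits inside one Word section. The class name HeadingSplitter and the method ExtractByHeading are hypothetical names chosen for illustration; they are not part of Aspose.Words.

```csharp
using System;
using Aspose.Words;

public static class HeadingSplitter
{
    // Hypothetical helper (not an Aspose.Words API).
    // ASSUMPTION: section titles are body-level paragraphs styled as Heading 1.
    private static bool IsHeading1(Node node)
    {
        return node.NodeType == NodeType.Paragraph
            && ((Paragraph)node).ParagraphFormat.StyleIdentifier == StyleIdentifier.Heading1;
    }

    // Copies the heading whose text matches headingText (case-insensitive), plus every
    // following body-level node up to the next Heading 1, into a new document.
    // Returns false if no matching heading is found.
    public static bool ExtractByHeading(Document srcDoc, string headingText, string outputFile)
    {
        // Search all paragraphs for the matching heading.
        Paragraph startPara = null;
        foreach (Paragraph para in srcDoc.GetChildNodes(NodeType.Paragraph, true))
        {
            bool isBodyLevel = para.ParentNode != null && para.ParentNode.NodeType == NodeType.Body;
            if (isBodyLevel && IsHeading1(para)
                && string.Equals(para.GetText().Trim(), headingText, StringComparison.OrdinalIgnoreCase))
            {
                startPara = para;
                break;
            }
        }
        if (startPara == null)
            return false;

        // Create the destination document and drop its default empty paragraph.
        Document dstDoc = new Document();
        dstDoc.FirstSection.Body.RemoveAllChildren();

        // Copy the heading, then keep copying until the next Heading 1
        // or the end of the current section body (whichever comes first).
        Node currNode = startPara;
        do
        {
            Node dstNode = dstDoc.ImportNode(currNode, true, ImportFormatMode.KeepSourceFormatting);
            dstDoc.FirstSection.Body.AppendChild(dstNode);
            currNode = currNode.NextSibling;
        }
        while (currNode != null && !IsHeading1(currNode));

        dstDoc.Save(outputFile);
        return true;
    }
}
```

It could be called as HeadingSplitter.ExtractByHeading(new Document("c:/temp/DocToSplit.docx"), "Review", "c:/temp/Review.docx"). If your users type the titles as plain text without a heading style, the IsHeading1 check would need to be replaced by a comparison against the list of title names you expect.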