Free Support Forum - aspose.com

How to copy content of Word Document based on PageBreak

I need to copy the first three pages of a Word document to another document. I don't know in advance what these three pages will contain; they may include images, tables, paragraphs, etc. Is there any way to cut just these first three pages and save them in another document?

Hi,

It is possible if the end of the 3rd page coincides with a "new page" section break or explicit page break. The former case is much simpler to implement. Is it applicable to you?

There will be a page break at the end of the third page. I don't care about the content after it. I just need to save the first three pages of my document in a database.

Here's a sample method that inserts the first three pages of a source document after the specified node of a destination document:

/// <summary>
/// Inserts content of the external document after the specified node.
/// </summary>
/// <param name="node">Node in the destination document where the external document content should be inserted.</param>
/// <param name="doc">Document to insert.</param>
public void InsertDocument(Node node, Document doc)
{
    // Walk up the tree until the node is a direct child of a story, cell or shape.
    CompositeNode parentNode = node.ParentNode;
    while (true)
    {
        if (parentNode == null)
            throw new Exception("Document cannot be inserted after the specified node.");

        if (parentNode is Story || parentNode is Cell || parentNode is Shape)
            break;

        node = parentNode;
        parentNode = node.ParentNode;
    }

    int index = node.ParentNode.ChildNodes.IndexOf(node);
    Document dstDoc = node.Document;
    Section insertedSection;

    foreach (Section section in doc.Sections)
    {
        insertedSection = (Section)dstDoc.ImportNode(section, true, ImportFormatMode.KeepSourceFormatting);

        foreach (Node insertedNode in insertedSection.Body.ChildNodes)
        {
            // Do not insert the node if it is the last empty paragraph in the section.
            // The comparison uses the imported section, because insertedNode belongs to it.
            if (insertedNode is Paragraph && insertedNode == insertedSection.Body.LastChild && insertedNode.ToTxt().Equals(string.Empty))
                break;

            if (insertedNode is Paragraph)
            {
                Paragraph para = (Paragraph)insertedNode;

                // Stop copying at the first paragraph that begins with an explicit page break.
                if ((para.Runs.Count > 0) && (para.Runs[0].Text.StartsWith(ControlChar.PageBreak)))
                    return;
            }

            parentNode.ChildNodes.Insert(++index, insertedNode.Clone(true));
        }
    }
}
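For the "save it in a database" part, a call might look roughly like the sketch below. It assumes the InsertDocument method from this post lives in the same class. The class name FirstPagesExample, the helper name ExtractLeadingPages and the srcPath parameter are made up for illustration. The helper returns the copied content as a byte array that you could then write to a binary column.

```csharp
using System.IO;
using Aspose.Words;

public class FirstPagesExample
{
    // Hypothetical helper: copies the leading pages of the document at srcPath
    // into a new document and returns that document as bytes.
    public byte[] ExtractLeadingPages(string srcPath)
    {
        Document srcDoc = new Document(srcPath);

        // A blank document already contains one section with one empty paragraph,
        // which serves as the node to insert after.
        Document dstDoc = new Document();

        InsertDocument(dstDoc.FirstSection.Body.FirstParagraph, srcDoc);

        using (MemoryStream stream = new MemoryStream())
        {
            dstDoc.Save(stream, SaveFormat.Doc);
            return stream.ToArray();
        }
    }

    // Assumption: InsertDocument(Node, Document) from this post is pasted here.
}
```

Because the content is inserted after the blank document's initial paragraph, the result starts with one empty paragraph. Remove it afterwards if that matters for your case.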

Basically, this is a modified version of the document-importing algorithm posted in one of the recent threads; it stops as soon as it reaches a paragraph that begins with an explicit page break. It does not count pages, so it relies on the break at the end of the third page being the first one in the document that starts a paragraph. I'm also attaching a sample source document with an explicit page break at the end of the third page.
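To make the stopping rule concrete, here is a small library-free sketch that models each paragraph as a plain string. It assumes the explicit page break is the form feed character (U+000C), and the names PageBreakRule and TakeUntilLeadingPageBreak are invented for this illustration; they are not part of Aspose.Words.

```csharp
using System.Collections.Generic;

public static class PageBreakRule
{
    // Assumption: an explicit page break is represented by form feed (U+000C).
    private const char PageBreak = '\f';

    // Returns the leading paragraphs, stopping at the first paragraph whose
    // text begins with a page break. That paragraph itself is not included.
    public static List<string> TakeUntilLeadingPageBreak(IEnumerable<string> paragraphs)
    {
        var kept = new List<string>();
        foreach (string text in paragraphs)
        {
            if (text.Length > 0 && text[0] == PageBreak)
                break;

            kept.Add(text);
        }
        return kept;
    }
}
```

Given the paragraphs "one", "two\fmore", "\fthree" and "four", the method keeps the first two. The break inside "two\fmore" is ignored because it is not at the start of the paragraph, which mirrors the limitation of the check in InsertDocument.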