Extract content (with formatting) from a file

prasanth.skumar · May 28, 2008, 12:52am

Hi,
To extract the content of a document, I used the following code snippet. (The purpose is to insert the content into a new document; the line that does this is commented out here.) The code does not read the text from the body of the attached document. If I remove the page number from the document footer, the code works. Can you help me with this? Should I go to the body section specifically and read the text from there?

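// First paragraph found anywhere in the document (header, footer, or body)
Node oStartBody = srcDoc.GetChild(NodeType.Paragraph, 0, true);
Node currentNode = oStartBody;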
string csNodeText;
while (currentNode != null)
{
    // paleceHolder.ParentNode.InsertBefore(m_oOutDoc.ImportNode(currentNode, true, ImportFormatMode.KeepSourceFormatting), paleceHolder);
    csNodeText = currentNode.GetText();
    currentNode = currentNode.NextSibling;
}

regards
Prasanth

alexey.noskov · May 28, 2008, 3:46am

Hi
Thanks for your request. If you need to insert the whole document into another document, please see the following link:
https://docs.aspose.com/words/java/insert-and-append-documents/
If you just need to merge two documents together, you can try the following code:

Document dstDoc = new Document("doc1.doc");
Document srcDoc = new Document("doc2.doc");
foreach (Section srcSection in srcDoc.Sections)
{
    Node dstSection = dstDoc.ImportNode(srcSection, true, Aspose.Words.ImportFormatMode.KeepSourceFormatting);
    dstDoc.AppendChild(dstSection);
}
dstDoc.Save("out.doc");
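On the original question about reading only the body text: one likely cause, offered here as an assumption and not a confirmed diagnosis, is that GetChild(NodeType.Paragraph, 0, true) returns the first paragraph in the whole document, which can be the footer paragraph holding the page number, so the NextSibling loop never reaches the body. A minimal sketch that starts from each section's Body instead is shown below; the class name BodyTextExample is only illustrative, and "doc2.doc" is reused from the example above.

```csharp
using System;
using System.Text;
using Aspose.Words;

class BodyTextExample
{
    static void Main()
    {
        // Source file name reused from the merge example; replace with your own document.
        Document srcDoc = new Document("doc2.doc");
        StringBuilder bodyText = new StringBuilder();

        foreach (Section section in srcDoc.Sections)
        {
            // Section.Body holds the main text of the section.
            // Headers and footers are separate children of the section and are skipped here.
            Node currentNode = section.Body.FirstChild;
            while (currentNode != null)
            {
                // GetText() returns the node text, including control characters
                // such as paragraph breaks.
                bodyText.Append(currentNode.GetText());
                currentNode = currentNode.NextSibling;
            }
        }

        Console.WriteLine(bodyText.ToString());
    }
}
```

The same loop can call ImportNode on each currentNode, as in the commented-out line of the question, if the goal is to copy the body content into another document and not just read its text.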

I hope this helps.
Best regards.
