Pulling content from one document to a new one - tables missing

ChadYo · December 24, 2012, 10:37am

Hello,

I'm trying to pull content from a Word document and put it into a new one. My document has a tag "[START_CONTENT_TAG]" where the content starts and a tag "[END_CONTENT_TAG]" where it ends, and I just want to grab everything between those two tags. It works for the most part, but it's not bringing in tables. Should I be using something other than Paragraphs to grab all of the content? Here's my code:

FileStream docStream = new System.IO.FileStream("myFile.doc", FileMode.Open, FileAccess.Read, FileShare.ReadWrite);

    Document sourceDocument = new Document(docStream);
    NodeImporter importer = new NodeImporter(sourceDocument, dstDoc, ImportFormatMode.KeepSourceFormatting);
    NodeCollection paragraphs = sourceDocument.GetChildNodes(NodeType.Paragraph, true);
    bool start = false;
    bool end = false;
    foreach (Paragraph paragraph in paragraphs)
    {
        if (paragraph.ToTxt().ToUpper().Contains("[END_CONTENT_TAG]"))
        {
            end = true;
        }
        if (start && !end)
        {
            Node importNode = importer.ImportNode(paragraph, true);
            dstDoc.FirstSection.Body.AppendChild(importNode);
        }
        if (paragraph.ToTxt().ToUpper().Contains("[START_CONTENT_TAG]"))
        {
            start = true;
        }
    }
    dstDoc.FirstSection.Body.AppendChild(new Paragraph(dstDoc));
    dstDoc.Save(tempFilePath);
}

awais.hafeez · December 25, 2012, 10:03pm

Hi Chad,

Thanks for your inquiry. To extract every kind of document element (Shapes, Tables, Paragraphs, etc.) enclosed between your custom ‘Start’ and ‘End’ tags, rather than only paragraphs, I think you need to implement the following workflow:

1. Find the node (in Aspose.Words’ DOM) which represents the starting tag, i.e. [START_CONTENT_TAG].
2. Find the node which represents the ending tag, i.e. [END_CONTENT_TAG].
3. You can then use the code suggested in this article for extracting content between these ‘Start’ and ‘End’ nodes.
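
As a rough illustration of this workflow (a minimal sketch, not the code from the linked article), the idea is to walk the block-level children of the Body between the two tag paragraphs instead of iterating over paragraphs only, so that tables are imported as whole nodes. The names TaggedContentExtractor, FindTopLevelBlock and CopyBetweenTags are hypothetical helpers invented for this example. The sketch assumes that both tags sit in the same section's Body and that each tag is in its own top-level paragraph.

```csharp
using System;
using Aspose.Words;

// Hypothetical helper class for illustration; not part of Aspose.Words.
static class TaggedContentExtractor
{
    // Returns the direct child of a Body that contains the first paragraph
    // whose text includes the given tag (case-insensitive), or null if none.
    static Node FindTopLevelBlock(Document doc, string tag)
    {
        foreach (Paragraph para in doc.GetChildNodes(NodeType.Paragraph, true))
        {
            if (para.GetText().IndexOf(tag, StringComparison.OrdinalIgnoreCase) < 0)
                continue;

            // Climb up until the node is a direct child of a Body.
            Node node = para;
            while (node.ParentNode != null && node.ParentNode.NodeType != NodeType.Body)
                node = node.ParentNode;

            // Paragraphs in headers/footers never reach a Body; skip them.
            if (node.ParentNode != null)
                return node;
        }
        return null;
    }

    // Copies every block-level node (paragraphs AND tables) found between
    // the two tag paragraphs into the first section of dstDoc.
    public static void CopyBetweenTags(Document srcDoc, Document dstDoc,
                                       string startTag, string endTag)
    {
        Node startBlock = FindTopLevelBlock(srcDoc, startTag);
        Node endBlock = FindTopLevelBlock(srcDoc, endTag);
        if (startBlock == null || endBlock == null)
            throw new InvalidOperationException("Start or end tag not found in a document body.");
        if (startBlock.ParentNode != endBlock.ParentNode)
            throw new InvalidOperationException("This sketch assumes both tags are in the same section.");

        NodeImporter importer = new NodeImporter(srcDoc, dstDoc, ImportFormatMode.KeepSourceFormatting);
        Body dstBody = dstDoc.FirstSection.Body;

        // Siblings of the start block are the Body's direct children, so a
        // table is visited once as a Table node and imported with its cells.
        for (Node node = startBlock.NextSibling; node != null && node != endBlock; node = node.NextSibling)
            dstBody.AppendChild(importer.ImportNode(node, true));

        // Keep the body ending with a paragraph, as the original code does.
        if (!(dstBody.LastChild is Paragraph))
            dstBody.AppendChild(new Paragraph(dstDoc));
    }
}
```

With something like this in place, the loop in your code could be replaced by a call such as TaggedContentExtractor.CopyBetweenTags(sourceDocument, dstDoc, "[START_CONTENT_TAG]", "[END_CONTENT_TAG]") before saving. If the tags can appear in different sections or inside a table cell, the more general approach from the article is the safer choice.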

I hope this helps. Please let me know if I can be of any further assistance.

Best regards,