Hi,
How can I read data from the attached Word file?
10092409.zip (9.0 KB)
For example, you can read all the text from this Word document with the following code:
Document doc = new Document("E:\\Temp\\10092409\\10092409.docx");
string allText = doc.ToString(SaveFormat.Text);
Or you can parse the content paragraph by paragraph with the following code:
Document doc = new Document("E:\\Temp\\10092409\\10092409.docx");
foreach (Paragraph para in doc.GetChildNodes(NodeType.Paragraph, true))
{
    Console.WriteLine(para.ToString(SaveFormat.Text));
}
Please also refer to the following article:
Thank you, hafeez, for your reply.
I tried your code and it works fine.
But I need to separate the paragraphs based on the boxes in the document,
because the data from each box will be loaded into a different textbox.
The document has no bookmarks, so I am not able to separate the paragraphs.
Kindly suggest a solution.
10092409.zip (26.1 KB)
Please find attached the document with some sample data.
The boxes in your document are actually Content Controls (represented by the StructuredDocumentTag class in Aspose.Words). You can get the paragraphs contained inside these boxes with the following code. Hope this helps.
Document doc = new Document("E:\\Temp\\10092409\\10092409.docx");
Table tab = doc.FirstSection.Body.Tables[0];
int i = 1;
foreach (StructuredDocumentTag sdt in tab.GetChildNodes(NodeType.StructuredDocumentTag, true))
{
    if (sdt.Level == MarkupLevel.Block)
    {
        Console.WriteLine("Box=" + i + " -------------");
        i++;
        foreach (Paragraph para in sdt.GetChildNodes(NodeType.Paragraph, true))
        {
            Console.WriteLine(para.ToString(SaveFormat.Text));
        }
    }
}
Thanks, Haffez, for your reply.
I am now getting the data block by block.
But one of the boxes contains a table, and its data is read piece by piece.
Can I get the full data of a box at one time,
so that I can set it directly to a textbox?
When using bookmarks, I used the following code, which worked perfectly for me:
Document htmlDoc = AsposeLicense.GenerateDocument(doc, nodes);
try
{
    htmlDoc.FirstSection.Body.FirstParagraph.Remove();
}
catch { }
String sb = htmlDoc.ToString(SaveFormat.Html);
Can I get the same syntax for boxes?
Thanks
You can get the HTML representation of a complete content control with the following code:
Document doc = new Document("E:\\Temp\\10092409\\10092409.docx");
Table tab = doc.FirstSection.Body.Tables[0];
int i = 1;
foreach (StructuredDocumentTag sdt in tab.GetChildNodes(NodeType.StructuredDocumentTag, true))
{
    if (sdt.Level == MarkupLevel.Block)
    {
        Console.WriteLine("Box=" + i + " -------------");
        i++;
        HtmlSaveOptions opts = new HtmlSaveOptions(SaveFormat.Html);
        opts.PrettyFormat = true;
        opts.ExportImagesAsBase64 = true;
        // specify any more HtmlSaveOptions
        Console.WriteLine(sdt.ToString(opts));
    }
}
Hope this helps.
Thanks haffez, it's working fine.
Thanks for your support.
Thanks for your feedback. In case you have further inquiries or need any help in the future, please let us know.