Hi,
For one of my projects, I have a requirement to extract the data “fields” from Word documents and export them to a database. How should I start? I have tried the demo, but I still have doubts about the implementation. All the fields are in a table with different columns and rows.
- The demo seems to extract data from only one Word file. Is it possible to extract data from multiple Word files with different filenames, for example Test1.doc and Test2.doc?
- The demo doesn’t seem to find my Word file with the “fields”; it shows the error “file not found”.
I’m thinking of using Aspose.Words to iterate over the Word files, read the “field” contents and save them to the respective columns in the database. Is this correct?
Hi Koh,
Thanks for your inquiry. Could you please share your input Word document along with details of the data “fields” that you want to extract from it? We will then provide more information about your query along with code.
Hi,
I have attached the file for your review.
Thank you.
Best Regards,
Wee Liang Koh
Hi Wee,
Thanks for sharing the document. Your document contains form fields. The FormField class represents a single form field, and you can get the name and value of a form field using this class. Please check the following code snippet. Hope this helps you.
Document doc = new Document(MyDir + "Sample.docx");
foreach (FormField field in doc.Range.FormFields)
{
    if (field.Type == FieldType.FieldFormTextInput)
        Console.WriteLine(field.Name + " : " + field.Result);
}
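The question about several files (Test1.doc, Test2.doc) can be handled by running the same loop once per document. The following is only a minimal sketch: the folder path `C:\Docs` and the class name are hypothetical, and it prints the values instead of writing them to a database. Passing a full path to the Document constructor may also help when a file is reported as not found.

```csharp
using System;
using System.IO;
using Aspose.Words;
using Aspose.Words.Fields;

class ExtractFormFields
{
    static void Main()
    {
        // Hypothetical folder that holds Test1.doc, Test2.doc, and so on.
        string inputDir = @"C:\Docs";

        // "*.doc*" matches .doc and .docx files in the folder.
        foreach (string path in Directory.GetFiles(inputDir, "*.doc*"))
        {
            string fileName = Path.GetFileName(path);

            // Skip the temporary lock files Word creates while a document is open.
            if (fileName.StartsWith("~$"))
                continue;

            Document doc = new Document(path);
            foreach (FormField field in doc.Range.FormFields)
            {
                // Only text input form fields, as in the snippet above.
                if (field.Type == FieldType.FieldFormTextInput)
                    Console.WriteLine(fileName + " | " + field.Name + " : " + field.Result);
            }
        }
    }
}
```

Each printed line carries the file name, so values from different documents can be told apart before they are stored.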
Hi, thanks for the code. It seems to create a duplicate of the document. May I know if Aspose can extract the value of each field and save it to the respective table column in a database?
Hi Wee,
Thanks for your inquiry. We suggest you read about the document object model of Aspose.Words:
Aspose.Words Document Object Model
With Aspose.Words you can perform document processing tasks, including extracting content from a document. However, Aspose.Words does not offer APIs to store data in a database. Please read about the members of the FormField class.
Could you please share what content you want to extract from the input document? We will share a code example according to your requirements.
Hi, I was able to read the fields and save them to the database. I also have a drop-down list field, for which I save the code based on the description.
This works:
else if (field.Type == FieldType.FieldFormDropDown && field.Name.ToLower() == "org_country")
{
    company.Address_Country = country.Where(x => x.Description.ToLower() == doc.Range.FormFields["org_country"].Result.Trim().ToLower()).Select(x => x.Code).FirstOrDefault();
}
But when I try it with a different drop-down list, it always retrieves null.
This doesn’t work:
else if (field.Type == FieldType.FieldFormDropDown && field.Name.ToLower() == "org_type")
{
    company.TypeOfCompany = companycode.Where(x => x.Description.ToLower() == doc.Range.FormFields["org_type"].Result.Trim().ToLower()).Select(x => x.Code).FirstOrDefault();
}
Hi Wee,
Thanks for your inquiry. Please use the FormField.DropDownItems property to access the items of a drop-down form field, as shown in the following code example. Hope this helps you.
Document doc = new Document(MyDir + "in.docx");
foreach (FormField formField in doc.Range.FormFields)
{
    if (formField.Type == FieldType.FieldFormDropDown)
    {
        Console.WriteLine(formField.DropDownItems[formField.DropDownSelectedIndex]);
    }
}
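To connect this with the description-to-code lookup from the earlier post, the selected item text could be read through DropDownItems and then matched against the lookup data. This is a rough sketch, not a confirmed fix for the null result: the class name `DropDownLookup`, both method names, and the dictionary keyed by description are hypothetical stand-ins for the `companycode` list.

```csharp
using System;
using System.Collections.Generic;
using Aspose.Words;
using Aspose.Words.Fields;

static class DropDownLookup
{
    // Returns the text of the selected drop-down item, or null when the
    // field is not a drop-down or the selected index is out of range.
    public static string GetSelectedText(FormField field)
    {
        if (field == null || field.Type != FieldType.FieldFormDropDown)
            return null;

        int index = field.DropDownSelectedIndex;
        if (index < 0 || index >= field.DropDownItems.Count)
            return null;

        return field.DropDownItems[index].Trim();
    }

    // Maps a description to its code; returns null when nothing matches.
    // Case sensitivity depends on the comparer the dictionary was built with.
    public static string FindCode(string description, IDictionary<string, string> codesByDescription)
    {
        if (description == null)
            return null;

        string code;
        return codesByDescription.TryGetValue(description, out code) ? code : null;
    }
}
```

A call might look like `company.TypeOfCompany = DropDownLookup.FindCode(DropDownLookup.GetSelectedText(doc.Range.FormFields["org_type"]), codes);`, where `codes` is assumed to be a `Dictionary<string, string>` created with `StringComparer.OrdinalIgnoreCase` so that the match ignores case. If the result is still null, printing the selected text next to the stored descriptions should show whether the two strings actually differ.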