Extracting Specific Text From Word Doc


We want to use Aspose.Words to extract specific text from some Word documents. The documents are formatted as follows (see also the attached file):


First Name: Suzanne
Last Name: Test

Home Phone: 905-123-4567
Work Phone: 905-234-5678
Other Phone: 905-333-2222
Comments: Oct/07: Comment line1.
Aug 10/07: Comment line2.
Aug 12/03: Comment line3

Some more text


Is there any way to search for "First Name:" and retrieve the text "Suzanne", search for "Comments:" and retrieve the 3 lines of text, etc.?

Thanks for your help.



Thank you for your interest in Aspose.Words. I think you can achieve this using regular expressions and a ReplaceEvaluator. For example, the following code extracts the first name.

public void TestReplaceEvaluator_109307()
{
    // Open the document
    Document doc = new Document(@"458_109307_queuesystems\in.doc");

    // Capture the text after "First Name:" up to the paragraph break in a group named "value"
    Regex regex = new Regex(@"First Name:(?<value>.*?)\r");

    // Find the string; the evaluator is called for each match
    doc.Range.Replace(regex, new ReplaceEvaluator(ReplaceAction_109307), true);
}

static ReplaceAction ReplaceAction_109307(object sender, ReplaceEvaluatorArgs e)
{
    // Get the first name from the "value" group of the match
    string firstName = e.Match.Groups["value"].Value;

    // Skip the replacement so the document text is left unchanged
    return ReplaceAction.Skip;
}

You can use the following Regex to extract the comments.

Regex regex = new Regex(@"Comments:(?<value>.*?)\f");

Here "\r" is the paragraph break character and "\f" is the page break character.
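To check the two patterns without opening a document, here is a minimal sketch that runs them against a plain string. It assumes the document text uses "\r" between paragraphs and "\f" at the page break, as described above; the class name FieldExtractor, the Extract helper, and the sample string are made up for illustration and are not part of Aspose.Words.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FieldExtractor
{
    // Hypothetical helper: returns the trimmed text of the "value" group,
    // or null when the pattern does not match.
    public static string Extract(string text, string pattern)
    {
        Match match = Regex.Match(text, pattern);
        return match.Success ? match.Groups["value"].Value.Trim() : null;
    }

    public static void Main()
    {
        // Assumed stand-in for the document text: "\r" ends each paragraph, "\f" marks the page break.
        string text = "First Name: Suzanne\rLast Name: Test\r" +
                      "Comments: Oct/07: Comment line1.\rAug 10/07: Comment line2.\rAug 12/03: Comment line3\f";

        // Lazy match stops at the first paragraph break after the label.
        Console.WriteLine(Extract(text, @"First Name:(?<value>.*?)\r"));

        // "." excludes only "\n" in .NET, so this match runs across "\r" until the page break.
        string comments = Extract(text, @"Comments:(?<value>.*?)\f");
        Console.WriteLine(comments.Replace("\r", Environment.NewLine));
    }
}
```

The same Extract call should work for the other labels, for example "Home Phone:", by changing only the label in the pattern.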

I hope that this will help you.

Best regards

Thanks for the quick response!

Your code works great!

Hi Alexey,

Do you know how I can set up the regex to match from "Comments:" to the end of the document? Is there a special character for the end of the document?

Regex regex = new Regex(@"Comments:(?<value>.*?)\END?");




I think that you can try using the following Regex.

Regex regex = new Regex(@"Comments:(?<value>.*)");
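A greedy ".*" needs no end-of-document marker because it keeps matching until the text runs out. The sketch below shows this on a plain string. It assumes paragraphs are separated by "\r"; RegexOptions.Singleline is an added precaution, since "." in .NET otherwise stops at "\n". The class name CommentsToEnd and the sample text are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CommentsToEnd
{
    // Captures everything after "Comments:" up to the end of the text.
    // Singleline lets "." match "\n" as well, in case the text contains line feeds.
    public static string Extract(string text)
    {
        Match match = Regex.Match(text, @"Comments:(?<value>.*)", RegexOptions.Singleline);
        return match.Success ? match.Groups["value"].Value.Trim() : string.Empty;
    }

    public static void Main()
    {
        // Assumed stand-in for the document text, with "\r" ending each paragraph.
        string text = "Other Phone: 905-333-2222\r" +
                      "Comments: Oct/07: Comment line1.\rAug 10/07: Comment line2.\r";

        // Print one comment per line.
        Console.WriteLine(Extract(text).Replace("\r", Environment.NewLine));
    }
}
```

Note that this also picks up any text that follows the comments, such as "Some more text" in the sample document.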

Hope this helps.

Best regards.