Extract content between

Hello,

I’m trying to use Aspose to get the text in a .doc file that is contained between two tags, like for instance
From what I've seen, I think I need to use regular expressions, but I can't seem to understand exactly how I am supposed to use them, or even whether I should.

Can someone please give me a hint on how to accomplish this?

Regards,

Hi

Thanks for your request. To extract content from a document, you can use the DocumentVisitor class. Please read this How-to from our documentation.
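For reference, here is a minimal sketch of how a DocumentVisitor could collect the document text into a single string. It is not the code from the How-to; the class name PlainTextCollector and the file name input.doc are made up for illustration, and the sketch only handles runs and paragraph ends (fields, tables, headers and so on are ignored). It assumes the Aspose.Words for .NET package is referenced.

```csharp
using System;
using System.Text;
using Aspose.Words;

// "input.doc" is a hypothetical path used only for this example.
Document doc = new Document("input.doc");

// Walk the document tree; the visitor gathers text as nodes are visited.
PlainTextCollector collector = new PlainTextCollector();
doc.Accept(collector);

string allDocumentContent = collector.Text;
Console.WriteLine(allDocumentContent);

// Hypothetical helper class: appends the text of every Run in document
// order and adds a line break at the end of each paragraph.
class PlainTextCollector : DocumentVisitor
{
    private readonly StringBuilder _text = new StringBuilder();

    public string Text => _text.ToString();

    public override VisitorAction VisitRun(Run run)
    {
        _text.Append(run.Text);
        return VisitorAction.Continue;
    }

    public override VisitorAction VisitParagraphEnd(Paragraph paragraph)
    {
        _text.AppendLine();
        return VisitorAction.Continue;
    }
}
```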
After you get the document content, you can use Regex to get all the text between the tags.

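using System;
using System.Text.RegularExpressions;

// allDocumentContent holds the text extracted from the document.
// The opening tag goes inside the lookbehind (?<=...) and the closing tag inside the lookahead (?=...).
Regex exp = new Regex(@"(?<=).+?(?=)", RegexOptions.IgnoreCase);
MatchCollection matchList = exp.Matches(allDocumentContent);
foreach (Match content in matchList)
{
    Console.WriteLine(content.Value + "\n");
}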
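To make the lookbehind/lookahead idea concrete, here is a small self-contained sketch that needs only the .NET base library. The delimiters "[start]" and "[end]" and the helper name ExtractBetween are placeholders chosen for this example, not the tags from the original question. It also adds RegexOptions.Singleline, which is an assumption on my side: without it, '.' does not match line breaks, so content spanning several paragraphs would be missed.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: returns every piece of text found between
// `open` and `close`, in the order it appears in `text`.
static List<string> ExtractBetween(string text, string open, string close)
{
    // Regex.Escape makes regex metacharacters in the delimiters
    // (such as '[', '(' or '$') match literally.
    string pattern = "(?<=" + Regex.Escape(open) + ").+?(?=" + Regex.Escape(close) + ")";

    // Singleline lets '.' match newlines as well.
    Regex regex = new Regex(pattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);

    List<string> results = new List<string>();
    foreach (Match match in regex.Matches(text))
    {
        results.Add(match.Value);
    }
    return results;
}

// "[start]" and "[end]" are placeholder delimiters for this example only.
string allDocumentContent =
    "Intro [start]first part[end] middle [start]second\npart[end] outro";

foreach (string content in ExtractBetween(allDocumentContent, "[start]", "[end]"))
{
    Console.WriteLine(content);
}
```

The lazy quantifier `.+?` stops at the first closing delimiter, so two tagged regions in the same text come back as two separate matches rather than one long one.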

Hope this helps.

In addition, I think you can use the approach suggested here to extract content between user-defined strings:
https://forum.aspose.com/t/99344
However, note that there have been a few changes in the Aspose.Words API, so you should use IReplacingCallback instead of ReplaceEvaluator:
https://reference.aspose.com/words/net/aspose.words.replacing/ireplacingcallback/
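As a rough illustration of the IReplacingCallback route, the sketch below collects the matched text without changing the document. It is not the code from the linked thread; the class name MatchCollector, the file name input.doc, and the "[start]"/"[end]" delimiters are placeholders for this example. It assumes a version of Aspose.Words for .NET that provides FindReplaceOptions.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using Aspose.Words;
using Aspose.Words.Replacing;

// "input.doc" is a hypothetical path used only for this example.
Document doc = new Document("input.doc");

MatchCollector collector = new MatchCollector();
FindReplaceOptions options = new FindReplaceOptions { ReplacingCallback = collector };

// Placeholder delimiters; group 1 captures the text between them.
Regex pattern = new Regex(@"\[start\](.+?)\[end\]", RegexOptions.IgnoreCase);

// The replacement string is never applied because the callback returns Skip.
doc.Range.Replace(pattern, "", options);

foreach (string content in collector.Matches)
{
    Console.WriteLine(content);
}

// Hypothetical callback class: records each match and leaves the document as-is.
class MatchCollector : IReplacingCallback
{
    public List<string> Matches { get; } = new List<string>();

    ReplaceAction IReplacingCallback.Replacing(ReplacingArgs args)
    {
        Matches.Add(args.Match.Groups[1].Value);
        return ReplaceAction.Skip;
    }
}
```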