Document Search

We would like some samples for the following using Aspose.Words:

  1. Search/find any text and highlight the matching words.
  2. Search/find phrases and highlight them inside documents.
  3. Find partial matches for text phrases and highlight the section where the phrase was found.

@nmedishetti You can easily achieve the 1st and 2nd points using the Find/Replace functionality. Here is a simple example:

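// Requires: using System.Drawing; using Aspose.Words; using Aspose.Words.Replacing;
Document doc = new Document(@"C:\Temp\in.docx");

FindReplaceOptions opt = new FindReplaceOptions();
// Treat "$0" in the replacement as the whole match, so the text itself is unchanged.
opt.UseSubstitutions = true;
// Apply yellow highlighting to the replaced (matched) text.
opt.ApplyFont.HighlightColor = Color.Yellow;
doc.Range.Replace("Highlight me", "$0", opt);

doc.Save(@"C:\Temp\out.docx");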

The 3rd point can be achieved using IReplacingCallback. Here is a simple example:

// Requires: using System.Drawing; using Aspose.Words; using Aspose.Words.Replacing;
Document doc = new Document(@"C:\Temp\in.docx");

FindReplaceOptions opt = new FindReplaceOptions();
opt.ReplacingCallback = new HighlightSection();
// The replacement string is not applied because the callback returns ReplaceAction.Skip.
doc.Range.Replace("Highlight me", "", opt);

doc.Save(@"C:\Temp\out.docx");

private class HighlightSection : IReplacingCallback
{
    public ReplaceAction Replacing(ReplacingArgs args)
    {
        // Get the section where the matched text is located.
        Section sect = (Section)args.MatchNode.GetAncestor(NodeType.Section);
        // Highlight every run in that section.
        foreach (Run r in sect.GetChildNodes(NodeType.Run, true))
            r.Font.HighlightColor = Color.Yellow;

        // Leave the document text unchanged.
        return ReplaceAction.Skip;
    }
}

Thank you!

I am trying to find the individual words of a phrase inside a paragraph and highlight the paragraph if it matches partially. See the example below.

Text to be searched: “printing typesetting survived Letraset”

Document paragraph to be searched and highlighted, since it contains all four (or at least a few) of the words:

" Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry’s standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularized in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum"

@nmedishetti In this case you can either search for each individual word in the phrase or use a regular expression to match the partially matched phrase.
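
As a rough sketch of the regular-expression route, the helper below turns the phrase into an alternation of its words using only System.Text.RegularExpressions. The names PartialPhraseSearch and BuildWordRegex are hypothetical and not part of Aspose.Words; the sketch assumes the words are separated by spaces and that case-insensitive, whole-word matching is wanted.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Aspose.Words.
public static class PartialPhraseSearch
{
    // For "printing typesetting survived Letraset" this builds
    // \b(?:printing|typesetting|survived|Letraset)\b
    public static Regex BuildWordRegex(string phrase)
    {
        // Split the phrase on spaces and drop empty entries.
        string[] words = phrase.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // Escape each word so regex metacharacters are matched literally.
        string alternation = string.Join("|", words.Select(Regex.Escape));

        // \b keeps the match on whole words only.
        return new Regex(@"\b(?:" + alternation + @")\b", RegexOptions.IgnoreCase);
    }
}
```

The resulting Regex can then be passed to the Range.Replace overload that accepts a Regex, for example doc.Range.Replace(regex, "$0", opt) with UseSubstitutions and ApplyFont.HighlightColor set as in the first sample, so that each matched word is highlighted.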

Thank you!

Is there any option to search again within the next/previous x words of the document once I find the specific word I am looking for?

I have another query too.

Does Aspose.Words have an option to view a DOCX as HTML, i.e. a DOCX viewer (HTML)?

@nmedishetti

No, there is no option to search starting from a given position. But you can restrict the search to a single node, for example a paragraph:

doc.FirstSection.Body.FirstParagraph.Range.Replace("test", "replacement");
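
If the goal is to pick up the few words that follow a hit, one possible workaround (not a built-in Aspose.Words feature) is to fold that context into the regular expression itself. The sketch below uses only System.Text.RegularExpressions; ContextSearch and WordWithFollowing are hypothetical names, and "word" is assumed to mean a run of \w characters.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Aspose.Words.
public static class ContextSearch
{
    // Matches `word` plus up to `count` words that follow it.
    // Assumes count >= 0.
    public static Regex WordWithFollowing(string word, int count)
    {
        // (?:\W+\w+){0,count} consumes separator characters followed by one word,
        // repeated at most `count` times.
        string pattern = @"\b" + Regex.Escape(word) + @"\b(?:\W+\w+){0," + count + "}";
        return new Regex(pattern, RegexOptions.IgnoreCase);
    }
}
```

For example, WordWithFollowing("survived", 3) applied to "It has survived not only five centuries" matches "survived not only five". How such a pattern behaves across paragraph boundaries inside a document should be checked before relying on it.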

Aspose.Words is a document processing library and does not provide any UI for viewing documents. But you can convert a document to the HTML or HtmlFixed format to view it in a browser. The HtmlFixed format is designed specifically for document viewing purposes.
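
A minimal sketch of that conversion, assuming the Aspose.Words for .NET Document.Save(string, SaveFormat) overload. The class name, method name, and paths are placeholders chosen for illustration.

```csharp
using Aspose.Words;

// Hypothetical wrapper; only Document, Save and SaveFormat come from Aspose.Words.
public static class HtmlExport
{
    public static void SaveForBrowser(string inputPath, string outputPath)
    {
        // Load the source document, e.g. a .docx file.
        Document doc = new Document(inputPath);

        // HtmlFixed produces fixed-layout HTML intended for viewing;
        // SaveFormat.Html would produce flow-layout HTML instead.
        doc.Save(outputPath, SaveFormat.HtmlFixed);
    }
}
```

Calling HtmlExport.SaveForBrowser(@"C:\Temp\in.docx", @"C:\Temp\out.html") would then write a file that can be opened in a browser.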