Find Page Number of Searched Text in Word Document | C# .NET

Hi there,
I'm using Aspose.Words for .NET. I would like to know the most efficient way to find the page number of a searched text. I was able to achieve this for PDF documents but did not find a direct approach for Word documents.

@laxman.claysys,

You can use the LayoutCollector.GetStartPageIndex, LayoutCollector.GetEndPageIndex and LayoutCollector.GetNumPagesSpanned methods to find the page number of a node in a Word document. Please let me know if I can be of any further assistance.
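
As a rough illustration of those three methods, here is a minimal sketch that builds a small document in memory and prints the page span of each paragraph. The document content and the class name LayoutCollectorSketch are made up for the example, and it assumes the Aspose.Words for .NET package is referenced.

```csharp
using System;
using Aspose.Words;
using Aspose.Words.Layout;

class LayoutCollectorSketch
{
    static void Main()
    {
        // Build a sample document in memory (hypothetical content).
        Document doc = new Document();
        DocumentBuilder builder = new DocumentBuilder(doc);
        builder.Writeln("First paragraph");
        builder.InsertBreak(BreakType.PageBreak);
        builder.Writeln("Second paragraph");

        // The collector maps document nodes to pages of the computed layout.
        LayoutCollector collector = new LayoutCollector(doc);

        foreach (Paragraph para in doc.GetChildNodes(NodeType.Paragraph, true))
        {
            Console.WriteLine("\"{0}\": starts on page {1}, ends on page {2}, spans {3} page(s)",
                para.GetText().Trim(),
                collector.GetStartPageIndex(para),
                collector.GetEndPageIndex(para),
                collector.GetNumPagesSpanned(para));
        }
    }
}
```

The same calls accept any node, including a Run. If the document is modified after the layout has been built, the layout likely needs refreshing (collector.Clear() followed by doc.UpdatePageLayout()) before the page indexes can be trusted again.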

Thanks for the response @awais.hafeez

Could I get sample code for this in C#? I have a variable holding a sentence, and I want to search for it in the document and get the page number of that occurrence.

@laxman.claysys,

You can build logic on the following code to get the desired output:

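// Place these using directives at the top of the file.
using System;
using System.Collections;
using Aspose.Words;
using Aspose.Words.Layout;
using Aspose.Words.Replacing;

string textToSearch = "Third";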

Document doc = new Document("C:\\Temp\\input.docx");

FindReplaceOptions opts = new FindReplaceOptions();
opts.ReplacingCallback = new FindPageNumberOfText();
doc.Range.Replace(textToSearch, "", opts);

private class FindPageNumberOfText : IReplacingCallback
{
    ReplaceAction IReplacingCallback.Replacing(ReplacingArgs e)
    {
        // This is a Run node that contains either the beginning or the complete match.
        Node currentNode = e.MatchNode;

        // The first (and may be the only) run can contain text before the match,
        // in this case it is necessary to split the run.
        if (e.MatchOffset > 0)
            currentNode = SplitRun((Run)currentNode, e.MatchOffset);

        // Collects all runs that make up the match; used below for the page lookup.
        ArrayList runs = new ArrayList();

        // Find all runs that contain parts of the match string.
        int remainingLength = e.Match.Value.Length;
        while (
            (remainingLength > 0) &&
            (currentNode != null) &&
            (currentNode.GetText().Length <= remainingLength))
        {
            runs.Add(currentNode);
            remainingLength = remainingLength - currentNode.GetText().Length;

            // Select the next Run node.
            // Have to loop because there could be other nodes such as BookmarkStart etc.
            do
            {
                currentNode = currentNode.NextSibling;
            }
            while ((currentNode != null) && (currentNode.NodeType != NodeType.Run));
        }

        // Split the last run that contains the match if there is any text left.
        if ((currentNode != null) && (remainingLength > 0))
        {
            SplitRun((Run)currentNode, remainingLength);
            runs.Add(currentNode);
        }

        LayoutCollector collector = new LayoutCollector((Document)e.MatchNode.Document);
        int startPage = collector.GetStartPageIndex((Run)runs[0]);

        Console.WriteLine("Page number is {0}", startPage);

        return ReplaceAction.Skip;
    }

    private static Run SplitRun(Run run, int position)
    {
        Run afterRun = (Run)run.Clone(true);
        afterRun.Text = run.Text.Substring(position);
        run.Text = run.Text.Substring(0, position);
        run.ParentNode.InsertAfter(afterRun, run);
        return afterRun;
    }
}

Yes, that worked. Thanks for the help @awais.hafeez

Just one more question @awais.hafeez,
If I wanted to highlight the searched text in the above scenario, how can I add that to the above code efficiently? Should I add it to the callback class or outside it?
Sample code would be much appreciated.

@laxman.claysys,

You can add it inside the callback like this:

...
...
    LayoutCollector collector = new LayoutCollector((Document)e.MatchNode.Document);
    int startPage = collector.GetStartPageIndex((Run)runs[0]);

    Console.WriteLine("Page number is {0}", startPage);

    // Highlight every run that makes up the match (Color comes from System.Drawing).
    foreach (Run run in runs)
        run.Font.HighlightColor = Color.Yellow;

    return ReplaceAction.Skip;
}