Find functionality to return a Range object

Hi,

I am currently evaluating Aspose.Words with a view to migrating, and significantly enhancing, macros that our users currently run locally, moving them onto a central server (the plan is to run the code as the documents pass through our email gateway). A lot of the existing code makes use of the “Find” object in VBA.

I’ve read the below article, and looked elsewhere in the documentation - but can’t really find an easy way to “Find” within a document. To be clear I do not want to replace any text, but rather return a Range object which points to a specific text string, and then harvest subsequent text strings for storage elsewhere. In time I will probably want to manipulate based on found Ranges, and volumes could be significant so being able to find text efficiently is quite important.
https://docs.aspose.com/words/net/find-and-replace/

The first paragraph seems to suggest “finding” can be done with this method, but doesn’t explain how. It seems to cover only Replace. Suppose, for example, I want to find the first instance of “Author:” and return a Range for it. In VBA this would be:

Set myRange = ActiveDocument.Content
myRange.Find.Execute FindText:="Author", Forward:=True 'myRange is redefined to a new range which is the found text.

I had looked at Aspose.Cells first (we also do a similar process on Excel reports). It does have the kind of method I was looking for, and I was able to prototype what I’m trying to do quite quickly:

Cell cell = worksheet.Cells.FindStringStartsWith("Author:", null);

Additionally, the Find method in VBA is very useful in that it can find not just the “Author:” string, but also “Author:” strings which are Bold, 12pt, Italic etc., and it can use special characters (e.g. “^p”) to locate the end of paragraphs. I anticipate this coming in handy as we begin to enhance our processes server side, so I was wondering if this is supported by Aspose.Words in a straightforward way? One of the big measures of success for the project will be the extent to which we can reduce and simplify the code we need to maintain, so that developers can work on it with less risk. I did see some code here (https://docs.aspose.com/words/net/find-and-replace/) which I may be able to re-engineer to achieve what I’m after, but it seems like it could end up being a lot of code to do something which seems like a common requirement.

I am hoping the functionality is there and I’ve just missed it and there is at least an equivalent to the Aspose.Cells Find functionality in Aspose.Words.

Thanks for your attention to this.


Hi

Thanks for your request. There is no Find method in Aspose.Words. I think you can achieve what you need using the Range.Replace method together with a ReplaceEvaluator. Please see the following link for more information:
https://reference.aspose.com/words/net/aspose.words/range/replace/
Best regards.

Hi Alexey,

I’ve had a look at your link but can’t see how this allows me to do anything more than a Replace. I did specifically say I did not want to do a Replace; what I need is to find a specific set of text strings and have an object returned which then allows me to manipulate the document from there. I do not just want to change parts of the document based on the find, but to extract other values based on the found text’s location, do some business logic, and then insert extra text elsewhere in the document. If it is possible to create a custom replace evaluator to do a Find, I apologise; but that would in any case be quite an unintuitive(!) API approach in my view.

Here is a sample of the VBA code I need to replicate, in case it explains more and helps you point me in a better direction:

Dim searchRange As Range
Dim costCode As String
Set searchRange = ActiveDocument.Content

searchRange.Find.Execute ("EX1 Cost Code:") 'searchRange is redefined to a new range which is the found text.
costCode = searchRange.Tables(1).Cell(searchRange.Cells(1).RowIndex, searchRange.Cells(1).ColumnIndex + 1).Range.Text

'Other harvesting and business logic here.
Set searchRange = ActiveDocument.Content 'reset range to be searched.
searchRange.Find.Execute ("Disclaimer:^p")
searchRange.InsertAfter ("Newly generated disclaimer text.")

As I said before, the reason we are looking to use an external library from Aspose is to simplify maintenance of our code and move it to the server. I was a bit worried when I looked at the section you referred me to, as that would surely complicate rather than simplify matters! I am sure, however, that in a mature product like Aspose.Words there must be a simpler solution for this.

I am still working my way through the documentation but any help you can give would help as I need to decide on a way forward by early next week.

Finally - my next task if I can get something working is to look at performance/memory usage versus our other options. Do you have any papers or stats on the performance metrics of your product which could help?

thanks,
RA.

Hi

Thanks for your request and the additional information. I think the two most common uses of find functionality are:

  1. Count the number of occurrences of a particular word or phrase in a document.
  2. Get the value that follows the searched text. For example, suppose the document contains something like the following:

Product ID: [identifier of the product]
In this case, “Product ID” is a constant that can be searched for, and “[identifier of the product]” is a variable that we need to get from the document.
ReplaceEvaluator gives you great flexibility, so you can easily achieve both of these tasks. Here are simple code examples:

  1. Find number of occurrences of a word or phrase:
using System;
using System.Text.RegularExpressions;
using Aspose.Words;

// Open document.
Document doc = new Document(@"Test001\in.doc");
// Get number of occurrences of the phrase.
OccurrencesCounter counter = new OccurrencesCounter();
int occurrencesCount = counter.CountOccurrences(doc, "Aspose.Words");
Console.WriteLine(occurrencesCount);
private class OccurrencesCounter
{
    /// <summary>
    /// Calculates number of occurrences of the specified phrase.
    /// </summary>
    public int CountOccurrences(Document doc, string pattern)
    {
        // We use regular expression to search phrase in the document.
        Regex regex = new Regex(pattern, RegexOptions.IgnoreCase);
        // Get number of occurrences of the phrase.
        doc.Range.Replace(regex, new ReplaceEvaluator(MoveNext), true);
        return mOccurrencesCount;
    }
    /// <summary>
    /// Just moves to the next occurrence.
    /// </summary>
    private ReplaceAction MoveNext(object sender, ReplaceEvaluatorArgs e)
    {
        mOccurrencesCount++;
        return ReplaceAction.Skip;
    }
    private int mOccurrencesCount;
}
  2. A similar approach can be used to get the value after the searched text. Regular expressions simplify the task considerably. Here is a simple example:
// Open document.
Document doc = new Document(@"Test001\in.doc");
// Get product ID. In the document we have text like this: Product ID: 00000
// It is easy to find value of the Product ID using regular expressions.
Regex regex = new Regex(@"Product ID: (?<id>\d+)", RegexOptions.IgnoreCase);
doc.Range.Replace(regex, new ReplaceEvaluator(GetID), true);
private static ReplaceAction GetID(object sender, ReplaceEvaluatorArgs e)
{
    // Get ID.
    string id = e.Match.Groups["id"].Value;
    Console.WriteLine(id);
    return ReplaceAction.Skip;
}

It seems you have a similar task, except that in your case the value is in the next cell of a table. It is also easy to get the value from the next cell using ReplaceEvaluator. Here is the code:

// Open document.
Document doc = new Document(@"Test001\in.doc");
// Get product ID. In the document we have text "Product ID",
// and required value in the next cell.
Regex regex = new Regex(@"Product ID", RegexOptions.IgnoreCase);
doc.Range.Replace(regex, new ReplaceEvaluator(GetID), true);
private static ReplaceAction GetID(object sender, ReplaceEvaluatorArgs e)
{
    // Get table cell where the matched text is located.
    Cell cell = (Cell) e.MatchNode.GetAncestor(NodeType.Cell);
    if (cell == null)
        return ReplaceAction.Skip;
    // Print text from the next cell.
    if (cell.NextSibling != null)
        Console.WriteLine(cell.NextSibling.ToTxt());
    return ReplaceAction.Skip;
}

In addition, you can put your business logic into the ReplaceEvaluator.
I hope these examples help you achieve what you need.
Best regards.

@alexey.noskov
Can you send me the full OccurrencesCounter.java class?

@rabin.samanta

Thanks for your inquiry. This forum thread is very old and the code above uses obsolete APIs. Could you please share some details about your requirement? We will then answer your query accordingly.
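
In the meantime, the counting and value-extraction logic discussed earlier in this thread can be sketched in plain Java using only java.util.regex. This is a rough illustration, not an Aspose.Words API: it assumes the document text is already available as a String, and the class and method names (PlainTextFinder, countOccurrences, findProductId) are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Aspose.Words.
final class PlainTextFinder {

    private PlainTextFinder() {
    }

    // Counts case-insensitive, non-overlapping occurrences of a literal phrase.
    static int countOccurrences(String text, String phrase) {
        // Pattern.quote treats the phrase literally, so "." matches only a dot.
        Pattern pattern = Pattern.compile(Pattern.quote(phrase), Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    // Returns the digits that follow "Product ID: " for the first match, if any.
    static Optional<String> findProductId(String text) {
        // The named group "id" captures the value after the constant label.
        Pattern pattern = Pattern.compile("Product ID: (?<id>\\d+)", Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(text);
        return matcher.find() ? Optional.of(matcher.group("id")) : Optional.empty();
    }

    public static void main(String[] args) {
        String text = "Aspose.Words report. Product ID: 00123. Generated by aspose.words.";
        System.out.println(countOccurrences(text, "Aspose.Words"));
        System.out.println(findProductId(text).orElse("not found"));
    }
}
```

Unlike the older C# example, this sketch quotes the phrase before compiling it, so a search for "Aspose.Words" does not also match text such as "AsposeXWords". Working on plain text loses the link back to document nodes, so it does not cover the table-cell case above.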