CPU: i5-1335U (10 Cores, 12M Cache, up to 4.60 GHz)
HDD: Samsung SSD 990 PRO 1TB
OS: Windows 11
Our customers are reporting issues with one of our features being very slow. In this feature we are using Worksheet.Cells.Find along with a complex regex to look for tags in spreadsheet.
I went ahead and created a simple application in .NET Framework 4.8 (which is the framework that we are using in our product). I measured the time it took to run Worksheet.Cells.Find with a complex regex on a large spreadsheet filled with dummy text. I was able to confirm that the performance was poor.
I then created a simple application in .NET 8. When running the exact same code I noticed that it was a huge improvement in performance (about 30x faster in my test).
It would mean a lot to us and our customers if the performance of Worksheet.Cells.Find in .NET Framework 4.8 would be as fast as in .NET 8.
Sample (solution with two different projects and a sample.xlsx): AsposeCellFinder.7z (13.5 KB)
After initial testing, we reproduced the issue as you mentioned by using your sample project with template Excel file. I found performance issue with Worksheet.Cells.Find (having Regex) with .NET framework 4.x (compared to .NET (8.0) Core).
We require thorough evaluation of the issue. We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): CELLSNET-57770
You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.
We did a test where we replaced Worksheet.Cells.Find (having Regex) in .NET Framework 4.x with a for-loop that checks each indivudal cell against the provided Regex. This provided better performance compared to Worksheet.Cells.Find. Could this be a potential workaround or may I be missing out on something? Also, it makes me curious about what’s going on inside Worksheet.Cells.Find except for matching regexes?
// Possible workaround?
foreach (Cell cell in Worksheet.Cells) {
if (regex.IsMatch(cell.StringValue)) {
matches.Add(cell);
}
}
//Cell cell = null;
//var findOptions = new FindOptions { RegexKey = true };
//while(cell = Worksheet.Cells.Find(regex.ToString(), cell , findOptions)) {
// matches.Add(cell);
//}
We have investigated the issue thoroughly and found the cause of it(comparing with the simple match provided in your last post). When specifing the search key, the key will be refactored according to FindOptions.LookAtType to ensure it can be matched according to the expected type. By default LookAtType’s value is “Contains”, then the search key will be refactored by adding wildcards at the beginning and end of it. Such case makes the regex more complicated and the performance decrease. Because you have specified the exact rule for the provided regex, there is no need to use the additional conditions of LookAtType, so please set it as EntireContent and then you can get better performance:
Sets consent for sending user data to Google for online advertising purposes.
Sets consent for personalized advertising.
Cookie Notice
To provide you with the best experience, we use cookies for personalization, analytics, and ads. By using our site, you agree to our cookie policy.
More info
Enables storage, such as cookies, related to analytics.
Enables storage, such as cookies, related to advertising.
Sets consent for sending user data to Google for online advertising purposes.
Sets consent for personalized advertising.
Cookie Notice
To provide you with the best experience, we use cookies for personalization, analytics, and ads. By using our site, you agree to our cookie policy.
More info
Enables storage, such as cookies, related to analytics.
Enables storage, such as cookies, related to advertising.
Sets consent for sending user data to Google for online advertising purposes.