redis

Why does Redis SCAN return an empty result when the same pattern used with KEYS returns matches?


The KEYS command returns some results:

> keys Types/*/*BackgroundJob.json
1) "Types/Xyz.Data/Xyz.Data.BackgroundJobEngine.BackgroundJob.json"
2) "Types/Xyz.Web.SystemAdmin/Xyz.Web.SystemAdmin.Models.Encryption.EncryptionMethodByBackgroundJob.json"
3) "Types/BackgroundJobs/SharpTop.Engine.BackgroundJobs.AutofillBackgroundJob.json"
4) "Types/Quartz.Server/BJE.UDT.BackgroundJob.json"
5) "Types/DFControllersTest.Compensation/DFControllersTest.Compensation.SubmitCompensationPublishBackgroundJob.json"
6) "Types/SpecFlowTest.Architecture.Base/SpecFlowTest.Architecture.Base.Model.IntStudioConfigBackgroundJob.json"
7) "Types/SpecFlowTest.Benefits.UI/SpecFlowTest.Benefits.UI.Base.Services.BackgroundJobsService+BackgroundJob.json"
8) "Types/Xyz.WFM.ExpressionService.Client/Xyz.WFM.ExpressionService.Client.BackgroundJob.ExpressionManagerBackgroundJob.json"
9) "Types/DFControllersTest.Compensation/DFControllersTest.Compensation.SubmitGenerateBudgetWorksheetBackgroundJob.json"
10) "Types/DFControllersTest.Compensation/DFControllersTest.Compensation.SubmitCompensationUnPublishBackgroundJob.json"
11) "Types/IntegrationStudio/IntegrationStudio.DAL.Entities.IntStudioConfigBackgroundJob.json"
12) "Types/IntegrationStudio/IntegrationStudio.DAL.Entities.BackgroundJob.json"

But SCAN with the same pattern returns nothing:

> scan 0 match Types/*/*BackgroundJob.json
1) "1966080"
2) (empty list or set)

I followed the returned cursor by hand for several iterations, but without scripting it all the way through, it looked like an endless series of empty results.

What is going on?

Edit 1

I finally decided to code it:

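// Needs: using System; using System.Collections.Generic; using System.Runtime.CompilerServices;
//        using System.Threading; using StackExchange.Redis;
// `connection` is assumed to be a ConnectionMultiplexer field of the enclosing class.
private async IAsyncEnumerable<string> QueryRedisAsync(string pattern, [EnumeratorCancellation] CancellationToken ct = default)
{
    var db = connection.GetDatabase();
    var cursor = "0";
    int count = 0; // number of SCAN round trips
    do
    {
        ++count;
        ct.ThrowIfCancellationRequested();

        // The reply is a two-element array: [next cursor, keys matched in this batch].
        var tmp = await db.ExecuteAsync("SCAN", cursor, "MATCH", pattern, "COUNT", "1000");
        var scanResult = (RedisResult[])tmp;
        cursor = scanResult[0].ToString();
        var keys = (RedisKey[])scanResult[1];

        foreach (var key in keys)
        {
            yield return key.ToString();
        }
    }
    while (cursor != "0"); // a returned cursor of "0" means the iteration is complete
    Console.WriteLine(count);
}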

The code performed 1058 (!) iterations, and exactly one match was found on each of the following 12 iterations:

  1. 173
  2. 189
  3. 242
  4. 351
  5. 416
  6. 473
  7. 590
  8. 912
  9. 975
  10. 983
  11. 998
  12. 1027

So, I used SCAN in order to be "nice" and it caused 1058 round trips to the server.

Am I doing something wrong?

Possible duplicate

I don't think this is a duplicate of "redis scan returns empty results but nonzero cursor". It does not seem reasonable to make 1K+ round trips to the server to get just a few results.


Solution

  • The KEYS command behaves completely differently from the SCAN command.

    The KEYS command iterates over all keys in Redis and returns those matching your pattern. That's why a single round trip gives you the answer. However, while KEYS is running, Redis is blocked and cannot process any other command. So it's a bad idea to use KEYS in a production environment, especially when you have a large dataset.

    The SCAN command also iterates over the keys in Redis. However, each call examines only a limited number of keys (the COUNT option is a hint for how many; the default is 10), returns those that match your pattern, and hands back a cursor for the next call. The MATCH filter is applied after the keys are picked, so a call can legitimately return an empty list together with a non-zero cursor. You therefore need multiple round trips to cover the whole key space. Since each call checks only a few keys, it won't block Redis for a long time, and that's the recommended way to scan the key space.
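    As a minimal sketch of how COUNT can be raised from C#, assuming StackExchange.Redis 2.x and a single, non-clustered server: `IServer.KeysAsync` issues SCAN under the hood (on servers that support it) and follows the cursor for you, with `pageSize` sent as the COUNT hint. The class and method names below are hypothetical, and the value 10,000 is only an illustration; a larger COUNT means fewer round trips but more server time per call.

    ```csharp
    using System.Collections.Generic;
    using System.Linq;
    using System.Threading.Tasks;
    using StackExchange.Redis;

    public static class ScanExample
    {
        // Hypothetical helper: collects all keys matching `pattern` using SCAN,
        // with `pageSize` passed to the server as the COUNT hint.
        public static async Task<List<string>> FindKeysAsync(
            ConnectionMultiplexer connection, string pattern, int pageSize = 10_000)
        {
            var matches = new List<string>();

            // Assumption: one standalone server, so its first endpoint is the one to scan.
            var server = connection.GetServer(connection.GetEndPoints().First());

            // KeysAsync keeps calling SCAN until the server returns cursor 0.
            await foreach (var key in server.KeysAsync(pattern: pattern, pageSize: pageSize))
            {
                matches.Add(key.ToString());
            }
            return matches;
        }
    }
    ```

    Usage would look like `await ScanExample.FindKeysAsync(connection, "Types/*/*BackgroundJob.json")`. Every key is still visited; only the number of round trips changes.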

    "The code performed 1058 (!) iterations where exactly one match was found on some iteration, namely"

    Because you have a large dataset and only a small proportion of the keys match your pattern. With COUNT 1000, 1058 iterations means roughly a million keys were examined, and 1046 of those scans found no matching key at all.
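    The arithmetic behind that iteration count can be sketched as follows. This is a rough estimate only: COUNT is a hint, so the real number of keys examined per call varies, and the key count is inferred from the question's figures rather than measured. `ScanMath` and `EstimateRoundTrips` are hypothetical names.

    ```csharp
    using System;

    public static class ScanMath
    {
        // Estimated number of SCAN calls needed to walk `totalKeys` keys
        // when each call examines about `count` keys (ceiling division).
        public static long EstimateRoundTrips(long totalKeys, long count)
        {
            if (count <= 0) throw new ArgumentOutOfRangeException(nameof(count));
            return (totalKeys + count - 1) / count;
        }

        public static void Main()
        {
            // Assumption: 1058 iterations at COUNT 1000 implies about this many keys.
            long approxKeys = 1058L * 1000L;

            foreach (var count in new long[] { 10, 1_000, 10_000, 100_000 })
            {
                Console.WriteLine(
                    $"COUNT {count}: ~{EstimateRoundTrips(approxKeys, count)} round trips");
            }
        }
    }
    ```

    Under these assumptions the same key space takes about 106 calls at COUNT 10000 and about 11 at COUNT 100000, at the price of each call holding the server longer.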

    "So, I used SCAN in order to be "nice" and it caused 1058 round trips to the server. Am I doing something wrong?"

    SCAN is indeed nicer than KEYS, especially when you need to walk all keys in Redis (no pattern specified, or a large portion of the keys match the pattern).

    However, in your case, a better solution is to create a secondary index for the keys matching the pattern. For example, you can add the names of these keys to a Redis SET whenever you write them, and then read that SET to get the keys instead of scanning the whole key space.
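    A minimal sketch of such an index, assuming StackExchange.Redis and that all writes go through one helper. The SET name `Index/BackgroundJobTypes`, the class, and its methods are hypothetical; the predicate mirrors the glob `Types/*/*BackgroundJob.json` from the question.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Threading.Tasks;
    using StackExchange.Redis;

    public static class BackgroundJobIndex
    {
        // Hypothetical name of the SET that stores the indexed key names.
        private const string IndexKey = "Index/BackgroundJobTypes";
        private const string Prefix = "Types/";
        private const string Suffix = "BackgroundJob.json";

        // True when the key has the "Types/" prefix, another '/' after it,
        // and the "BackgroundJob.json" suffix.
        public static bool BelongsInIndex(string key) =>
            key.StartsWith(Prefix, StringComparison.Ordinal)
            && key.EndsWith(Suffix, StringComparison.Ordinal)
            && key.IndexOf('/', Prefix.Length) >= 0;

        // Writes the value and, if it qualifies, records the key name in the
        // index SET, both queued in a single MULTI/EXEC transaction.
        public static async Task<bool> SaveAsync(IDatabase db, string key, string json)
        {
            var tran = db.CreateTransaction();
            _ = tran.StringSetAsync(key, json);
            if (BelongsInIndex(key))
            {
                _ = tran.SetAddAsync(IndexKey, key);
            }
            return await tran.ExecuteAsync();
        }

        // Reads all indexed key names with one SMEMBERS call.
        public static async Task<List<string>> ListAsync(IDatabase db)
        {
            var result = new List<string>();
            foreach (var member in await db.SetMembersAsync(IndexKey))
            {
                result.Add(member.ToString());
            }
            return result;
        }
    }
    ```

    With this in place, listing the 12 keys is a single round trip. The cost is bookkeeping: deleting or renaming a key must also update the SET (for example with `SetRemoveAsync`), and for a very large index `SetScanAsync` would be gentler on the server than `SetMembersAsync`.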