phpredis

Get random / any value from Redis hash


I have a Redis hash with millions of elements, and new ones are added constantly. In PHP, I run an endless loop to get, process, and delete one element after the other. For this, I need to get the key of any existing element (preferably the first one inserted into the hash, FIFO).

while($redis->hlen()) {
    $key = ???
    // process $key    
}

While I know the RANDOMKEY and SRANDMEMBER commands, I did not find any way to get a single key of a hash. HGETALL and HKEYS are, due to the size of the hash, not an option either. I need sequential processing. Help appreciated.


Solution

  • There is no trick to access a random item (or the first, or the last) of a given hash object.

    Should you need to iterate on hash objects, one possibility is to split the data across several smaller hash objects, each selected by a hash code computed from the key.

    So instead of storing your hash as:

    myobject -> { key1:xxxx, key2:yyyy, key3:zzzz }
    

    you can store:

    myobject:<hashcode1> -> { key1:xxxx, key3:zzzz }
    myobject:<hashcode2> -> { key2:yyyy }
    ...
    

    To calculate the extra hashcode, you can apply to your keys any hash function that offers a good distribution. In the above example, we assume key1 and key3 have the same hashcode1 value and key2 has the hashcode2 value.

    You can find more information about this kind of data structure here:

    Redis 10x more memory usage than data

    The output cardinality of the hash function should be chosen so that the number of items per hash object is limited to a given value. For instance, if we choose to have 100 items per hash object, and we need to store 1M items, we will need a cardinality of 10K. To limit the cardinality, simply applying a modulo operation to a generic hash function is enough.
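As a rough illustration of the hashcode calculation described above, here is a minimal PHP sketch. The helper name bucketKey and the choice of crc32 are assumptions made for the example; any well-distributed hash function reduced with a modulo would do.

```php
<?php
// Hypothetical helper: maps a field name to the bucket hash that stores it.
// crc32 is only one possible hash function; it is an assumption of this sketch.
function bucketKey(string $prefix, string $field, int $buckets): string
{
    // Mask to 31 bits so the value is non-negative on 32-bit PHP builds too.
    $hashcode = (crc32($field) & 0x7FFFFFFF) % $buckets;
    return $prefix . ':' . $hashcode;
}

// Cardinality from the text: 1M items at about 100 items per hash object.
$buckets = intdiv(1000000, 100); // 10000

// With a connected phpredis client, a write would then look like:
// $redis->hSet(bucketKey('myobject', 'key1', $buckets), 'key1', 'xxxx');
```

Because the bucket is derived from the field name alone, a reader or writer can always recompute which hash object holds a given field.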

    The benefit is that it will be compact in memory (a ziplist is used), and you can easily iterate destructively over the hash objects by pipelining hgetall+del on each of them:

    hgetall myobject:0
    ... at most 100 items will be returned, process them ...
    del myobject:0
    hgetall myobject:1
    ... at most 100 items will be returned, process them ...
    del myobject:1
    ...
    

    You can therefore iterate chunk by chunk with a granularity which is determined by the output cardinality of the hash function.
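The chunk-by-chunk loop above could be sketched in PHP roughly as follows. This assumes $redis is a connected phpredis client (or any object offering pipeline(), hGetAll(), del() and exec() with the same behaviour); drainBuckets and $process are hypothetical names introduced for the example.

```php
<?php
// Sketch only: walks myobject:0 .. myobject:(buckets-1), reading and deleting
// each bucket in one round trip, then handing every item to $process.
function drainBuckets($redis, string $prefix, int $buckets, callable $process): int
{
    $processed = 0;
    for ($i = 0; $i < $buckets; $i++) {
        $key = $prefix . ':' . $i;

        // Queue HGETALL and DEL, then send both together.
        $pipe = $redis->pipeline();
        $pipe->hGetAll($key);
        $pipe->del($key);
        $replies = $pipe->exec();

        // The first reply is the HGETALL result: field => value pairs.
        $items = is_array($replies[0] ?? null) ? $replies[0] : [];
        foreach ($items as $field => $value) {
            $process($field, $value);
            $processed++;
        }
    }
    return $processed;
}
```

Note that a plain pipeline is not atomic: another client could add a field between the HGETALL and the DEL, and that field would be deleted unprocessed. If that matters, wrapping the two commands in MULTI/EXEC is one way to close the gap.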