I have learned about skiplist from Skip Lists: A Probabilistic Alternative to Balanced Trees recently. I think 64 levels with p=0.25 should have 4^64 elements( If p = 1/2, using MaxLevel = 16 is appropriate for data structures containing up to 2^16 elements.--quote from A Probabilistic Alternative to Balanced Trees). But when I check the Redis server.h, I see
define ZSKIPLIST_MAXLEVEL 64 /* Should be enough for 2^64 elements */
define ZSKIPLIST_P 0.25 /* Skiplist P = 1/4 */.
Am I wrong or the config wrong?
You are right. It is fair to say that server.h defines a MAXLEVEL higher than necessary.
The probability that a particular element reaches a level at least k
is p^k
, and the probability that any of the n
elements reaches a level at least k
is n·p^k
.
Redis uses the skip list for sorted sets, and it uses it in conjunction with a hash map. The hash map (dict
) has a capacity of 2^64-1
elements (ready for, not ever tested in practice). Hence the comment on ZSKIPLIST_MAXLEVEL
.
It follows that if we use N = 2^64
as an upper bound on the number of elements in the skip list, Redis could have used ZSKIPLIST_MAXLEVEL = 32.
The nodes allocate memory up the level they actually got when rolling the dice. Only the first, null node uses the ZSKIPLIST_MAXLEVEL. The impact: 512 bytes of extra memory per sorted list (of 129 entries or more - before, it uses a ziplist, see zset-max-ziplist-entries
config setting).
The other side effect is the chance a node would get an unnecessarily high level, but the odds get really low really fast as we go up beyond the 32nds level.
I think it is worth fixing IMO, for those 512 bytes.