language-agnosticbit-manipulationdhtbitfoo

Easiest way to find the correct kademlia bucket


In the Kademlia protocol node IDs are 160 bit numbers. Nodes are stored in buckets, bucket 0 stores all the nodes which have the same ID as this node except for the very last bit, bucket 1 stores all the nodes which have the same ID as this node except for the last 2 bits, and so on for all 160 buckets.

What's the fastest way to find which bucket I should put a new node into?

I have my buckets simply stored in an array, and need a method like so:

Bucket[] buckets; //array with 160 items

public Bucket GetBucket(Int160 myId, Int160 otherId)
{
    //some stuff goes here
}

The obvious approach is to work down from the most significant bit, comparing bit by bit until I find a difference, I'm hoping there is a better approach based around clever bit twiddling.

Practical note: My Int160 is stored in a byte array with 20 items, solutions which work well with that kind of structure will be preferred.


Solution

  • Would you be willing to consider an array of 5 32-bit integers? (or 3 64-bit integers)? Working with whole words may give you better performance than working with bytes, but the method should work in any case.

    XOR the corresponding words of the two node IDs, starting with the most significant. If the XOR result is zero, move on to the next most significant word.

    Otherwise, find the most significant bit that is set in this XOR result using the constant time method from Hacker's Delight.. This algorithm results in 32 (64) if the most significant bit is set, and 1 if the least significant bit is set, and so on. This index, combined with the index of the current word, will will tell you which bit is different.