Tags: algorithm, time-complexity, proof, dht, kademlia

How to understand the time complexity of Kademlia node operations


I'm currently learning about the Kademlia network by reading the classic paper Kademlia: A Peer-to-peer Information System Based on the XOR Metric. I want to understand the complexity of its operations but still cannot figure it out.

In section 3, "Sketch of proof", the paper gives two definitions:

  1. Depth of a node (h): 160 − i, where i is the smallest index of a non-empty bucket
  2. Node y’s bucket height in node x: the index of the bucket into which x would insert y minus the index of x’s least significant empty bucket.

And three conclusions:

  1. With overwhelming probability the height of any given node will be within a constant of log n for a system with n nodes.
  2. The bucket height of the closest node to an ID in the kth-closest node will likely be within a constant of log k.
  3. If none of this node’s h most significant k-buckets is empty, the lookup procedure will find a node half as close (or rather whose distance is one bit shorter) in each step, and thus turn up the node in h − log k steps.

So my questions are:

  1. What do "least significant empty bucket" and "most significant k-buckets" mean?
  2. How can the depth and bucket height be explained in a visual way?
  3. How should I understand the second and third conclusions? Specifically, why log k and h − log k?

Solution

  • It has been a while since I've actually read the paper, so I'm mostly piecing this together from my implementation experience rather than trying to match the concepts in my head to the formal definitions in the paper. Take the following with a grain of salt.


    What is "least significant empty bucket" and "most significant k-buckets"?

    That basically refers to the buckets sorted by XOR distance relative to the node's own ID.

    How to explain the depth and bucket height in visual way?

    Each bucket covers a keyspace region, e.g. from 0x0000 to 0x0FFF (simplified to 2 bytes) for 1/16th of the keyspace. This can be expressed in CIDR-like masks: 0x0/4 (4 prefix bits). That's more or less the depth of a bucket.

    There are several ways to organize a routing table. The "correct" way is to represent it as a tree or a sorted list keyed by the lowest ID represented by each bucket. This approach allows for the arbitrary bucket split operations that some routing table optimizations call for, and it can also be used to implement node multihoming.

    A simplified implementation may instead use a fixed-length array and put each bucket at the position of shared prefix bits relative to the node's own ID. I.e. position 0 in the array will have 0 shared prefix bits; it is the most distant bucket, the one covering 50% of the keyspace, namely the bucket where the most significant bit is the inverted MSB of the node's own ID.

    In that case the depth is simply the array position.
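
    For concreteness, here's a minimal Python sketch of that array-position scheme (my own illustration, not code from the paper), assuming 160-bit IDs as in Kademlia. The bucket index is simply the number of prefix bits a contact shares with the local ID:

    ID_BITS = 160

    def shared_prefix_bits(a: int, b: int) -> int:
        # Leading bits two IDs have in common: the XOR of the IDs has
        # its highest set bit exactly where they first differ.
        xor = a ^ b
        if xor == 0:
            return ID_BITS  # identical IDs
        return ID_BITS - xor.bit_length()

    def bucket_index(own_id: int, contact_id: int) -> int:
        # Position 0 = 0 shared prefix bits = the most distant bucket,
        # covering the half of the keyspace whose MSB differs from ours.
        return shared_prefix_bits(own_id, contact_id)

    own = 1 << 159                              # an ID whose MSB is 1
    print(bucket_index(own, own ^ (1 << 159)))  # 0: MSB differs
    print(bucket_index(own, own ^ 1))           # 159: only the last bit differs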

    How to understand the second and third conclusions, say, why log k and h - log k?

    Say you are looking for the ID that is furthest away from your own node's ID. Then you will only have one bucket covering that part of the keyspace; it covers the half of the keyspace whose most significant bit differs from yours. So you ask one (or several) nodes from that bucket. Since their ID bits have the first bit in common with your lookup target, they are more or less guaranteed to have split that region into two or more buckets, i.e. to have at least double the keyspace coverage for the target space. So they can provide information that is at least 1 bit better.

    As you query closer nodes to the target they will also have better keyspace coverage near the target region because that's also closer to their own node ID.

    Rinse, repeat until there are no closer nodes to be found.

    Since each hop shaves off at least 1 bit of distance at a time, you basically need O(log n) hops, where n is the network size: the network size dictates the average distance between nodes, and thus the bucket depth needed for your home bucket.
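
    If you want to convince yourself of that numerically, here's a toy Python experiment (illustration only, not the paper's proof): with n uniformly random IDs, the closest node to a random target shares roughly log2(n) prefix bits with it, and that is the number of bits an ideal lookup has to fix, i.e. its hop count.

    import math
    import random

    ID_BITS = 160

    def shared_prefix_bits(a, b):
        xor = a ^ b
        return ID_BITS if xor == 0 else ID_BITS - xor.bit_length()

    random.seed(1)
    for n in (1_000, 10_000, 100_000):
        nodes = [random.getrandbits(ID_BITS) for _ in range(n)]
        target = random.getrandbits(ID_BITS)
        closest = min(nodes, key=lambda node: node ^ target)
        # the shared prefix of the closest node tracks log2(n)
        print(n, shared_prefix_bits(closest, target), round(math.log2(n)))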

    If the target key is closer to your own ID you will need fewer steps since you will have better coverage of that region of the keyspace.

    Since k is a constant (the number of nodes per bucket), so is log k. Double the number of nodes in a bucket and it will have twice the resolution for the given keyspace region, and thus will (probabilistically) yield a node that is one bit closer to the target than a bucket of size k/2 would. I.e. you have to double the number of entries per bucket for each additional bit per hop you wish to save.
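
    To put rough numbers on the h − log k bound (my arithmetic, not the paper's), take h ≈ log2 n; each doubling of k then shaves about one hop:

    import math

    n = 1_000_000                 # network size
    for k in (8, 16, 32):         # entries per bucket
        hops = math.log2(n) - math.log2(k)
        print(k, round(hops, 1))  # 8 -> 16.9, 16 -> 15.9, 32 -> 14.9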


    Edit: here's a visualization of an actual single-homed BitTorrent DHT routing table, sorted by bucket prefix, i.e. not relative to the local node ID:

    Node ID: 2A631C8E 7655EF7B C6E99D8A 3BF810E2 1428BFD4
    buckets: 23 / entries: 173
    000...   entries:8 replacements:8
    00100...   entries:8 replacements:0
    0010100...   entries:8 replacements:2
    0010101000...   entries:8 replacements:4
    00101010010...   entries:8 replacements:7
    001010100110000...   entries:8 replacements:3
    0010101001100010...   entries:8 replacements:3
    00101010011000110000...   entries:8 replacements:1
    001010100110001100010...   entries:3 replacements:0
    0010101001100011000110...   entries:6 replacements:0
    0010101001100011000111...   entries:6 replacements:0
    0010101001100011001...   entries:8 replacements:2
    001010100110001101...   entries:8 replacements:1
    00101010011000111...   entries:8 replacements:2
    00101010011001...   entries:7 replacements:0
    0010101001101...   entries:8 replacements:0
    001010100111...   entries:8 replacements:0
    001010101...   entries:8 replacements:1
    00101011...   entries:7 replacements:0
    001011...   entries:8 replacements:0
    0011...   entries:8 replacements:8
    01...   entries:8 replacements:8
    1...   entries:8 replacements:8