In https://www.bittorrent.org/beps/bep_0042.html it states:
The expression to calculate a valid ID prefix (from an IPv4 address) is:
crc32c((ip & 0x030f3fff) | (r << 29))
And for an IPv6 address (ip is the high 64 bits of the address):
crc32c((ip & 0x0103070f1f3f7fff) | (r << 61))
r is a random number in the range [0, 7]. The resulting integer, representing the masked IP address is supposed to be big-endian before hashed. The "|" operator means bit-wise OR.
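To make the quoted expressions concrete, here is a minimal Python sketch of them. The helper names `crc32c` and `id_prefix` are made up for illustration, and the CRC is a plain bitwise CRC-32C (Castagnoli) written out by hand because the Python standard library does not ship one. Treat it as an illustration of the formula, not a reference implementation of BEP 42.

```python
import ipaddress

IPV4_MASK = 0x030F3FFF
IPV6_MASK = 0x0103070F1F3F7FFF


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def id_prefix(address: str, r: int) -> int:
    """Evaluate the quoted expression for one address (hypothetical helper)."""
    if not 0 <= r <= 7:
        raise ValueError("r must be in the range [0, 7]")
    ip = ipaddress.ip_address(address)
    if ip.version == 4:
        # Mask the 32-bit address, then put r in the top three bits.
        value = (int(ip) & IPV4_MASK) | (r << 29)
        size = 4
    else:
        # Only the high 64 bits of an IPv6 address are used.
        value = ((int(ip) >> 64) & IPV6_MASK) | (r << 61)
        size = 8
    # The masked integer is serialized big-endian before it is hashed.
    return crc32c(value.to_bytes(size, "big"))
```

With this sketch, `id_prefix("124.31.75.21", 1)` and `id_prefix("0.15.11.21", 1)` return the same value, because the two addresses differ only in bits that the mask clears.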
Why are the IPv4 and IPv6 addresses bitwise ANDed with 0x030f3fff and 0x0103070f1f3f7fff respectively?
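The mask introduces a non-linear relationship between how many octets of an address one controls and the number of distinct node ID prefixes one can generate. For IPv4 it keeps only 2, 4, 6 and 8 bits of the first through fourth octets (0x03, 0x0f, 0x3f, 0xff), so each higher-order octet contributes fewer bits than the one below it.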
I think the model here is that if you acquire address blocks, say eight /24 prefixes, that gives you 2048 addresses. Whether those addresses sit within the same /8 block or are spread over many blocks doesn't make much of a difference; in the end you still control the same number of addresses. So you get 8 bits of entropy from each block being a /24, plus 3 additional bits to distinguish the eight prefixes, not a full 3 bytes.
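A rough way to see the effect is to count how many distinct masked values a single contiguous IPv4 block can produce. The sketch below assumes only the IPv4 mask quoted in the question; `distinct_masked_addresses` is a made-up helper name. It counts the mask bits that fall inside the block's host part, since those are the only bits an owner of the block can vary.

```python
import ipaddress

IPV4_MASK = 0x030F3FFF


def distinct_masked_addresses(cidr: str) -> int:
    """Count distinct values of (ip & IPV4_MASK) over all ip in an IPv4 block."""
    network = ipaddress.ip_network(cidr)
    # Host bits are free to vary; only those that survive the mask matter.
    free_bits = int(network.hostmask) & IPV4_MASK
    return 2 ** bin(free_bits).count("1")


for cidr in ("10.0.0.0/24", "10.0.0.0/16", "10.0.0.0/8", "0.0.0.0/0"):
    network = ipaddress.ip_network(cidr)
    print(cidr, network.num_addresses, distinct_masked_addresses(cidr))
```

A /24 keeps all 256 of its values, but a /16 yields 16384 masked values out of 65536 addresses, a /8 yields 262144 out of 16777216, and the whole IPv4 space yields only 2^20. Each masked value can then be combined with the eight choices of r.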