javalong-integerbitcount

How does Long.bitCount() finds the number of set bits?


I know this is the code. But I'm not able to understand what it does

 `public static int bitCount(long i){
         i = i - ((i  > > > 1) & 0x5555555555555555L);
         i = (i & 0x3333333333333333L) + ((i  > > > 2) & 0x3333333333333333L);
         i = (i + (i  > > > 4)) & 0x0f0f0f0f0f0f0f0fL;
         i = i + (i  > > > 8);
         i = i + (i  > > > 16);
         i = i + (i  > > > 32);
       return (int)i & 0x7f;
 }`

Solution

  • Let's take 255 as an example. The bits are combined as we go along. First we start with 255 as 0b1111.1111 (8 1's in binary)

    The first line of code is:

    i = i - ((i  > > > 1) & 0x5555555555555555L);
    

    This line is combing every pair of 1's. Since we have 8 1's, we expect to combine our pairs, and get back something like 2,2,2,2. Since it's binary we'd expect 10101010.

    Let's look at i > > > 1. i was 0b1111.1111 and this is shifting down 1, so we get 0b0111.1111. We take the intersection, &, with 0b0101.0101 (this is from 5 being 101 in binary). Doing this keeps half of our bits, specifically all the ones that were initially at an even spot (the 2nd, 4th, 6th, 8th bits from our initial number).

    We then subtract this from our initial value, which is a bit hacky. We are trying to add our top bits to our bottom bits, so we could just do that:

    ((i > > > 1) & 0x5555) + (i & 0x5555)
    

    The term on the left would be the top bits, and the term on the right would be the bottom bits. But we know i = 2*(top bits) + 1*(bottom bits), because top bits are upshifted by 1 (which is the same as multiplying by 2). So by subtracting the top bits 1 time, we are getting the same result.

    Okay, so now we are ready for the second line of code. We currently have 0b1010.1010 for i, and we are ready to add each pair of 2. We expect to get 4,4 (4 bits used in each half), or 0100.0100 in binary. The code is:

    i = (i & 0x3333333333333333L) + ((i  > > > 2) & 0x3333333333333333L);
    

    We are getting the top 2 numbers in each group of 4, and the bottom 2 numbers, and we are adding them. 0x3333 = 0b0011.0011.0011.0011 so we can see that taking the intersection, &, with 3's keeps the bottom 2 numbers in a group. We get the bottom two numbers first, and then we shift i over 2 spots to get the top 2 numbers. And then we add: 0010.0010 + 0010.0010 = 0100.0100. Exactly as expected.

    Next we add the 2 groups of 4 together.

     i = (i + (i  > > > 4)) & 0x0f0f0f0f0f0f0f0fL;
    

    0x0f0f = 0b0000111100001111, so if we take the intersection with this, we are keeping every 4 numbers. We add i to itself downshifted 4, so we compute 0100.0100 + 0000.0100 = 0100.1000. 4 + 4 should give back 8, and 8 = 0b1000, but the top "4" still needs to be removed. The intersection with 0f0f0f0f does this.

    So now we have 0b1000, which is 8. The rest of the steps add in even higher bits (like 2 groups of 8 together, than 2 groups of 16..) but since our number (255) was only 8 bits long, the higher bits were all 0s, so this won't affect our result.