python algorithm math combinatorics counting

Formula for the Nth bit_count (or population count) of size M

I am interested in finding a simple formula for determining the nth time a particular bit_count occurs in the sequence of natural numbers. Specifically, what is the relationship between K and N in the table below. So, for example, what is the N of the K=123456789123456789123456789, I can tell you the M is 50, but what is the N?

length = 5
for K in range(2**length):
    bits = bin(K)[2:].zfill(length)
    M    = K.bit_count() # numbers of "1"s in the binary sequence
    N    = sum(1 for i in range(K) if M==i.bit_count())
    print(f'{K: <2}',bits,M,N)

'''
K  bits  M N
0  00000 0 0
1  00001 1 0
2  00010 1 1
3  00011 2 0
4  00100 1 2
5  00101 2 1
6  00110 2 2
7  00111 3 0
8  01000 1 3
9  01001 2 3
10 01010 2 4
11 01011 3 1
12 01100 2 5
13 01101 3 2
14 01110 3 3
15 01111 4 0
16 10000 1 4
17 10001 2 6
18 10010 2 7
19 10011 3 4
20 10100 2 8
21 10101 3 5
22 10110 3 6
23 10111 4 1
24 11000 2 9
25 11001 3 7
26 11010 3 8
27 11011 4 2
28 11100 3 9
29 11101 4 3
30 11110 4 4
31 11111 5 0
...
'''

So, I appear to have solved half of it. It appears that

N = (K-sum(2**i for i in range(M)).bit_count()

whenever N<=M. This appears to be because,

K = sum(2**i for i in range(M)) + sum(2**(M-1-i) for i in range(N))

again, whenever N<=M. It appears that N<=M occurs about half the time.

length = 5
for K in range(2**length):
    bits = bin(K)[2:].zfill(length)
    M    = K.bit_count() # numbers of "1"s in the binary sequence
    N    = sum(1 for i in range(K) if M==i.bit_count())
    if N <= M:
        A = sum(2**i for i in range(M)) + sum(2**(M-1-i) for i in range(N))
        B = (K - sum(2**i for i in range(M))).bit_count()
    else:
        A = '-'
        B = '-'
    print(f'{K: <2}',bits,M,N,A,B)

Solution

I don't know Python so this is Ruby. The algorithm should be easy to translate. The idea is, for each set bit (going from smaller to larger), count all numbers which are the same to the left of that bit, have the same number of set bits at that bit and to its right, but don't have that bit set.

Traverse your input's bits from smallest to largest.
Keep track of the set bits seen and the total bits seen.
Count all numbers with the same number of set bits seen but a smaller max set bit (than the current index) each time you encounter a set bit.
E.g., when reaching the smaller 1 in 10100, we add 2 to the count because there are 2 numbers with 1 set bit and a smaller max set bit: 10 and 01.

Code:

def g(k)
  set_bits_seen = 0
  all_bits_seen = 0
  
  total = 0
  
  while k > 0
    all_bits_seen += 1
    
    if k % 2 == 1 # cur bit is set
      set_bits_seen += 1
      if all_bits_seen > set_bits_seen
        total += n_choose_k(all_bits_seen - 1, set_bits_seen)
      end
    end
    k /= 2
  end
  return total
end

def n_choose_k(n, k)
  return 1 if k == 0 or k == n
  (k + 1 .. n).reduce(:*) / (1 .. n - k).reduce(:*)
end

Results:

1.upto(20) do |i|
  puts "#{i.to_s(2)}  --  #{g(i)}"
end

1  --  0
10  --  1
11  --  0
100  --  2
101  --  1
110  --  2
111  --  0
1000  --  3
1001  --  3
1010  --  4
1011  --  1
1100  --  5
1101  --  2
1110  --  3
1111  --  0
10000  --  4
10001  --  6
10010  --  7
10011  --  4
10100  --  8

And for your large example:

> g(123456789123456789123456789)
=> 3594960708495168399327022