calgorithmmathbinary

Count ones in a segment (binary)


There is a problem which i am working on it right now and it's as the following :

there are two numbers x1 and x2 and x2 > x1.

for example x1 = 5; and x2 = 10;

and I must find the sum of ones between x1 and x2 in binary representations.

5 = 101   => 2 ones
6 = 110   => 2 ones
7 = 111   => 3 ones
8 = 1000  => 1 one
9 = 1001  => 2 ones
10= 1010  => 2 ones
so the sum will be 
sum = 2 + 2 + 3 + 1 + 2 + 2 = 12 ones;

so I have managed to make a code without even transfer the numbers to binary and wasting execution time.

I noticed that the numbers of ones in every 2^n with n >= 1 is 1 Ex : 2^1 => num of ones is 1 2^2 => 1 2^15 => 1

you can test it here if you want: https://www.rapidtables.com/convert/number/decimal-to-binary.html?x=191

and between each 2^n and 2^(n+1) there are Consecutive numbers as you will see in this example :

      num              number of ones
2^4 = 16                      1
      17                      2
      18                      2
      19                      3
      20                      2
      21                      3
      22                      3
      23                      4
      24                      2
      25                      3
      26                      3
      27                      4
      28                      3
      29                      4
      30                      4
      31                      5
2^5 = 32                      1

so I write a code that can find how many ones between 2^n and 2^(n+1)

int t;                ////turns
int bin = 1;              //// numbers of ones in the binary format ,,, and 1 for 2^5
int n1 = 32;          //// 2^5  this is just for clarification 
int n2 = 64;          //// 2^6
int *keep = malloc(sizeof(int) * (n2 - n1);             ///this is to keep numbers because 
                                      /// i'll need it later in my consecutive numbers
int i = 0;
int a = 0;
n1 = 33               //// I'll start from 33 cause "bin" of 32 is "1";

while (n1 < n2)       /// try to understand it now by yourself
{
    t = 0;
    while (t <= 3)
    {
        
        if (t == 0 || t == 2) 
            bin = bin + 1;
        
        else if (t == 1) 
            bin = bin;
        
        else if (t == 3)
        {
            bin = keep[i];
            i++;
        }
        keep[a] = bin;
        a++;
        t++;
    }
    n1++;
}

anyway as you see I am close to solve the problem but they give me huge numbers and I must find the ones between them, unfortunately I have tried a lot of methods to calculate the "sum" using this above code and I ended up with time execution problem.
Ex: 1, 1000000000 the numbers of ones is >>> 14846928141

so can you give me a little hint what to do next, thanks in advance.

I'm doing this for CodeWar challenge: https://www.codewars.com/kata/596d34df24a04ee1e3000a25/train/c


Solution

  • Here is a proposal for a speedup:

    1. Find smallest y1 such that y1 >= x1 and that y1 is a power of 2

    2. Find largest y2 such that y2 <= x2 and that y2 is a power of 2

    3. Find p1 and p2 such that 2^p1=y1 and 2^p2=y2

    4. Calculate the amount of 1:s between y1 and y2

    5. Deal with x1 to y1 and y2 to x2 separately

    6. Sum the results from 4 and 5

    Let's focus on step 4. Let f(n) be the sum of ones up to (2^n)-1. We can quickly realize that f(n) = 2f(n-1) + 2^(n-1) and that f(1)=1. This can be even further refined so that you don't have to deal with recursive calls, but I highly doubt it will be of any importance. Anyway, f(n) = n2^(n-1)

    To get the result between y1 and y2, just use f(p2)-f(p1)

    For step 5, you can likely use a modified version of step 4.

    EDIT:

    Maybe I was to quick to say "quickly realize". Here is a way to understand it. The amounts of ones up to 2¹-1 is easy to see. The only two binary numbers below 2¹ are 0 and 1. To get the number of ones up to 2² we take the numbers below 2¹ and make a column:

    0
    1
    

    Clone it:

    0
    1
    0
    1
    

    And put 0:s before the first half and 1:s before the second half:

    00
    01
    10
    11
    

    To get 2³ we do the same. Clone it:

    00
    01
    10
    11
    00
    01
    10
    11
    

    And add 0 and 1:

    000
    001
    010
    011
    100
    101
    110
    111
    

    Now it should be easy to see why f(n) = 2*f(n-1) + 2^(n-1). The cloning gives 2f(n-1) and adding the 0:s and 1:s gives 2^(n-1). If 2^(n-1) is hard to understand, remember that 2^(n-1)=(2^n)/2. In each step we have 2^n rows and half of them get an extra 1.

    EDIT 2:

    When I looked at these columns, I got an idea for how to do step 5. Let's say that you want to find the amounts of 1:s from 10 to 15. Binary table for this would be:

    10: 1010
    11: 1011
    12: 1100
    13: 1101
    14: 1110
    15: 1111
    

    Look at the interval 12-15. The last two digits in binary is a copy of the corresponding table for 0-3. That could be utilized, but I leave that to you.

    EDIT 3:

    This was a fun problem. I wrote some python code that does this. I get some problems with too many recursive calls, but that could be solved pretty easily, and it should not be too complicated to convert this to C:

    def f(n):
        return n*2**(n-1)
    
    def numberOfOnes(x):
        if(x==0):
            return 0
        p = floor(log(x,2))
        a = f(p)
        b = numberOfOnes(x-2**p)
        c = x - 2**p +1
        return a+b+c
    

    EDIT 4, Oct 31, 2024, @abhijit-sarkar:

    Let f(n) be the number of 1s until 2^n where n >= 1. Since we clone the numbers from the previous step, and then prepend 1s to half of them, f(n) = f(n-1) + 2^(n-1). The 2nd factor comes from the fact that there are 2^n numbers from 0 until 2^n, and we append 1 to half of them.

    f(n) = 2 * f(n-1) + 2^(n-1)
         = 2 * 2 * f(n-2) + 2 * 2^(n-2) + 2^(n-1)  # Expanding f(n-1)
         = 2 * 2 * 2 * f(n-3) + 2 * 2 * 2^(n-3) + 2 * 2^(n-2) + 2^(n-1)  # Expanding f(n-2)
         = 2^(n-1) + (n-1) * 2^(n-1)  # When i=n-1, f(1)=1
         = n * 2^(n-1)
    

    This formula is used to calculate a.

    -- End EDIT 4

    I made an image so that you easier can understand what a, b and c does in the function numberOfOnes if we call it with numberOfOnes(12):

    enter image description here

    I have finally converted it to C. Of course I have used some code I found here on Stack overflow. I borrowed code for integer versions of log2 and pow, and made some small modifications.

    This code is probably possible to optimize further, but it is not necessary. It is lighting fast, and I was not able to measure it's performance.

    #include <stdio.h>
    #include <math.h>
    #include <assert.h>
    #include <stdint.h>
    #include <inttypes.h>
    typedef uint64_t T;
    
    // https://stackoverflow.com/a/11398748/6699433                                                                    
    const int tab64[64] = {
        63,  0, 58,  1, 59, 47, 53,  2,
        60, 39, 48, 27, 54, 33, 42,  3,
        61, 51, 37, 40, 49, 18, 28, 20,
        55, 30, 34, 11, 43, 14, 22,  4,
        62, 57, 46, 52, 38, 26, 32, 41,
        50, 36, 17, 19, 29, 10, 13, 21,
        56, 45, 25, 31, 35, 16,  9, 12,
        44, 24, 15,  8, 23,  7,  6,  5};
    
    T log2_64 (T value) {
        value |= value >> 1;
        value |= value >> 2;
        value |= value >> 4;
        value |= value >> 8;
        value |= value >> 16;
        value |= value >> 32;
        return tab64[((T)((value - (value >> 1))*0x07EDD5E59A4E28C2)) >> 58];
    }
    
    // https://stackoverflow.com/a/101613/6699433                                                                      
    T ipow(T base, T exp) {
        T result = 1;
        for (;;) {
            if (exp & 1) result *= base;
            exp >>= 1;
            if (!exp) break;
            base *= base;
        }
        return result;
    }
    
    T f(T n) { return ipow(2,n-1)*n; }
    
    T numberOfOnes(T x) {
        if(x==0) return 0;
        
        T p = floor(log2(x));
        T a = f(p);
        T e = ipow(2,p);
        T b = numberOfOnes(x-e);
        T c = x - e + 1;
        return a+b+c;
    }
    
    void test(T u, T v) {
        assert(numberOfOnes(u) == v);
    }
    
    int main() {
        // Sanity checks
        test(0,0);
        test(1,1);
        test(2,2);
        test(3,4);
        test(4,5);
        test(5,7);
        test(6,9);
        // Test case provided in question
        test(1000000000,14846928141);
    }