cbit-manipulationbitcount

Bit Twiddling in C - Counting Bits


I want to count the bits that are set in an extremely large bit-vector (i.e. 100,000 bits).

What I am currently doing is using a pointer to char (i.e. char *cPtr) to point to the beginning of the array of bits. I then:

1. look at each element of the array (i.e. cPtr[x]),   
2. convert it to an integer (i.e. (int) cPtr[x])   
3. use a 256 element look-up table to see how many bits are set in the given byte (i.e. cPtr[x]). 

It occurs to me that if I use a short int pointer instead (i.e. short int * sPtr), then I will only need half as many look-ups, but with a 65534 element look-up table, which will have its own cost in memory usage.

I'm wondering what is the optimal number of bits to examine each time. Also, if that number is not the size of some preset type, how can I walk down my bit-vector and set a pointer to be ANY arbitrary number of bits past the starting location of the bit array.

I know there are other ways to count bits, but for now I want to be certain I can optimize this method before making comparisons to other methods.


Solution

  • I'm wondering what is the optimal number of bits to examine each time

    The only way to find out is to test. See this question for a discussion of the fastest way to count 32 bits at a time.

    Also, if that number is not the size of some preset type, how can I walk down my bit-vector and set a pointer to be ANY arbitrary number of bits past the starting location of the bit array.

    You can't set a pointer to an arbitrary bit. Most machines have byte-addressing, some can only address words.

    You can construct a word starting with an arbitrary bit like so:

    long wordAtBit(int32_t* array, size_t bit)
    {
        size_t idx = bit>>5;
        long word = array[idx] >> (bit&31);
        return word | (array[idx+1] << (32 - (bit&31));
    }