bit-manipulationbitwise-operatorsunsigned-integerset-theory

Is there a general approach for optimizing bitwise expressions that distinguish two arbitrary sets of integers?


For context, I need to write a test for an integer between 0 and 7 inclusive, that evaluates to true for {1,3,4,6} and false for {0,2,5,7}. I thought for a couple minutes about whether there might be a short expression combining a few bitwise operations that would do the job (similar to how n & 1 would work on {1,3,5,7}, for example) but none jumped out at me.

In practice, there's no pressing need to use bitwise arithmetic here, something like a switch or lookup table will work fine, but it did lead to me wondering whether there exists a deterministic way to find such an expression: giving a non-zero result when evaluated on numbers from the first set, and zero on numbers of the second set, using only bitwise and, or, not, and xor (so, for example, no boolean && or ==)

I don't have the mathematical background to substantiate it, but it feels reasonable to assume that it should be possible to write some kind of "brute force" expression, whatever the upper bound, where each case is evaluated separately, chained together by |, but is there a more efficient approach, either to simplify a "worst case" implementation or to build one up from nothing?


Solution

  • In general, it is not possible. The restriction to only bitwise and, or, not, and xor means each bit of the result depends only on that bit of the input.

    Suppose we want to distinguish {1, 2} from {0, 3} using such a function f, where the result will be non-zero for {1, 2} and zero for {0, 3}.

    The results f(0) and f(3) are required to be 0. Because bits cannot affect other bits, f(1) must be 1 and f(2) must be 2. Given the results f(1) and f(2), the result f(3) must have its least significant bit equal to the least significant bit of f(1) and its next to least significant bit equal to the next to least significant bit of f(2). Therefore, f(3) must be 3, which means f cannot distinguish the two sets as required.

    An alternative explanation is based on the number of such functions that can exist. The result bit can be 0 or 1 for input 0 and independently 0 or 1 for input 1, giving four possibilities per bit. For three-bit numbers, this means there are 4 * 4 * 4 = 64 possibilities, which is less than the 256 possible subsets of a set of eight items. Therefore, there must be at least 256 - 64 = 192 subsets that cannot be distinguished by such a function.