c++ x86 bit-manipulation bmi

Select spans of set bits in a bitmask that overlap with a 1-bit in a selector bitmap


Given two bitmasks a and b (with b a subset of a), I want to select the spans of contiguous 1-bits in a that overlap with a set bit in b:

a = 0b1111001110001100;
b = 0b0000001010001000;
//c=0b0000001110001100
//    XXXX  YYY   ZZ

The XXXX group is 0 in c because b & XXXX is false. The ZZ group is copied because b has one of the Z bits set. The YYY group is also set in c for the same reason. Notice that b can have multiple set bits in a single group in a.

So for every contiguous group of 1s in a, set all of those bits in c if b has a 1 in any of those positions. A more complex example:

std::uint64_t a = 0b1101110110101;
std::uint64_t b = 0b0001010010001;
// desired   c == 0b0001110110001
// contiguous groups   ^^^ ^^   ^  that overlap with a 1 in b

assert((a & b) == b);         // b is a subset of a (== binds tighter than &, so the parentheses are required)

std::uint64_t c = some_magic_operation(a, b);
assert(c == 0b0001110110001);
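
To pin down the intended semantics, here is a minimal loop-based reference sketch. It is deliberately naive (one pass over all 64 bit positions) and only meant as something to test faster versions against; the name select_spans_reference is hypothetical.

```cpp
#include <cstdint>

// Hypothetical reference implementation: scans the 64 bit positions once,
// collecting the current run of 1-bits in `a` and remembering whether `b`
// has a set bit anywhere inside that run.
inline std::uint64_t select_spans_reference(std::uint64_t a, std::uint64_t b)
{
    std::uint64_t c = 0;
    std::uint64_t run = 0;   // bits of the run of `a` currently being scanned
    bool selected = false;   // true once `b` hits the current run
    for (int i = 0; i < 64; ++i) {
        const std::uint64_t bit = std::uint64_t{1} << i;
        if (a & bit) {
            run |= bit;
            selected = selected || ((b & bit) != 0);
        } else {
            if (selected) c |= run;  // the run ended: keep it only if it was hit
            run = 0;
            selected = false;
        }
    }
    if (selected) c |= run;  // a run reaching bit 63 has no terminating 0 bit
    return c;
}
```

With the values above, select_spans_reference(a, b) should equal the desired c.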

Are there any bit-logic instructions/intrinsics (MMX, SSE, AVX, BMI1/BMI2) or bit-manipulation tricks that would let me compute c from a and b efficiently, i.e. without loops?


ADDITIONAL:

Using the hint from Denis' answer, the best I can come up with is a loop-based algorithm:

std::uint64_t a = 0b0110111001001101;
std::uint64_t b = 0b0100101000001101;
assert((a & b) == b); // subset

std::cout << std::bitset< 16 >(a) << std::endl;
std::cout << std::bitset< 16 >(b) << std::endl;
std::uint64_t x = (a + b) & ~a;
std::uint64_t c = 0;
while ((x = (a & (x >> 1)))) { // runs once per bit of the longest selected run of 1s
    c |= x;
}
std::cout << std::bitset< 16 >(c) << std::endl;

Solution

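  • For uint64_t you can use this trick:

    Let a = 0b11011101101. Having at least one 0 bit is important, because the carry out of a run of 1s lands in the 0 bit directly above it (a run that reaches the top bit of the word has nowhere to carry to). This bitmask has 4 separate runs of 1-bits. If you compute c = a + (a & b) (here c is the intermediate sum, not the final result), each run overflows if at least one bit of b is set inside it. You can then check which runs overflowed. For example, to look at the 2nd and 3rd runs of a (counting from the most significant end), test the carry positions just above them:

        assert(c & 0b00100010000);
        //              ^^^ ^^ these segments overflow into the tested bits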
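
The overflow check only tells you which runs were hit. One possible way to extend it into a loop-free computation of the full result is sketched below. This is not taken from the answer; it is a sketch under the assumption that a fixed-length bit reversal is acceptable. The names reverse_bits64, fill_up and select_spans are hypothetical. The idea: inside a hit run, a ^ (a + b) combined with b marks every bit from the lowest selector bit up to the top of the run. Doing the same on bit-reversed inputs covers the part of the run below the highest selector bit, and the union of the two is the whole run.

```cpp
#include <cstdint>

// Reverse the bit order of a 64-bit word with a fixed sequence of swaps
// (no loop; some compilers offer a builtin that could be used instead).
inline std::uint64_t reverse_bits64(std::uint64_t x)
{
    x = ((x >> 1)  & 0x5555555555555555ull) | ((x & 0x5555555555555555ull) << 1);
    x = ((x >> 2)  & 0x3333333333333333ull) | ((x & 0x3333333333333333ull) << 2);
    x = ((x >> 4)  & 0x0F0F0F0F0F0F0F0Full) | ((x & 0x0F0F0F0F0F0F0F0Full) << 4);
    x = ((x >> 8)  & 0x00FF00FF00FF00FFull) | ((x & 0x00FF00FF00FF00FFull) << 8);
    x = ((x >> 16) & 0x0000FFFF0000FFFFull) | ((x & 0x0000FFFF0000FFFFull) << 16);
    return (x >> 32) | (x << 32);
}

// Requires b to be a subset of a. For every run of 1s in `a` that `b` hits,
// returns the bits from the lowest `b` bit in that run up to the run's top.
inline std::uint64_t fill_up(std::uint64_t a, std::uint64_t b)
{
    const std::uint64_t sum = a + b;  // the carry ripples upward through each hit run
    return ((a ^ sum) | b) & a;       // runs that were not hit are unchanged in sum
}

// Loop-free sketch: fill upward, fill downward (via bit reversal), combine.
inline std::uint64_t select_spans(std::uint64_t a, std::uint64_t b)
{
    b &= a;  // ignore selector bits that fall outside `a`
    const std::uint64_t up = fill_up(a, b);
    const std::uint64_t down =
        reverse_bits64(fill_up(reverse_bits64(a), reverse_bits64(b)));
    return up | down;
}
```

Because the result is masked with a, a carry that falls off the top of the word does no harm in this variant. Whether the two bit reversals are cheaper than the short loop depends on the target and on how long the runs typically are, so it is worth measuring.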