c++ sse sse4

Zero remaining Bytes after first Zero in SSE Register


For this question, I will use the notation 1 for a byte with all ones (0xFF) and 0 for a byte with all zeros.

I am looking for a way to zero the remaining bytes in an SSE register after the first zero byte using SSE 4.2 intrinsics:

Example Input:
1111'1101'1011'1000
Desired Output:
1111'1100'0000'0000

Please note that the data should remain in the SSE register. This is a trivial task to do in a simple byte array!


Solution

  • SSE4.2 provides string instructions returning masks or indexes. In your case this should work:

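    #include <nmmintrin.h> // SSE4.2 string intrinsics (compile with e.g. -msse4.2)

    // generate a mask of `0xff` until first `0` entry of `a`:
    __m128i mask_until_first_zero(__m128i a)
    {
        // the default compare mode is "equal any", which is false for invalid bytes,
        // so this is set where `a` equals `a` and is valid (i.e., until the first `0` byte):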
        return _mm_cmpistrm(a, a, _SIDD_UNIT_MASK); // <-- _SIDD_UNIT_MASK to create byte mask instead of bit mask
        //            ^   ^-- `m` to return a mask
        //            `------ `i` for implicit length string
    }
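
If the bytes are not guaranteed to be only `0xFF` or `0x00`, the mask can be combined with the input using a bitwise AND. The following is a minimal sketch of that idea, not a tuned implementation. The names `zero_after_first_zero`, `zero_after_first_zero_bytes` and the `SSE42_TARGET` macro are made up for this illustration; the macro assumes GCC or Clang and only exists so the snippet builds without `-msse4.2`.

```cpp
#include <cstdint>
#include <nmmintrin.h> // SSE4.2: _mm_cmpistrm

// Assumption: GCC/Clang. Enables SSE4.2 for the marked functions only;
// with other compilers, enable SSE4.2 through the compiler options instead.
#if defined(__GNUC__) || defined(__clang__)
#define SSE42_TARGET __attribute__((target("sse4.2")))
#else
#define SSE42_TARGET
#endif

// 0xFF in every byte before the first zero byte of `a`, 0x00 from there on.
SSE42_TARGET inline __m128i mask_until_first_zero(__m128i a)
{
    return _mm_cmpistrm(a, a, _SIDD_UNIT_MASK);
}

// Hypothetical helper: keep the bytes before the first zero byte,
// clear that byte and everything after it. The data stays in a register.
SSE42_TARGET inline __m128i zero_after_first_zero(__m128i a)
{
    return _mm_and_si128(a, mask_until_first_zero(a));
}

// Hypothetical wrapper for checking results against a plain byte array.
SSE42_TARGET inline void zero_after_first_zero_bytes(const std::uint8_t in[16],
                                                     std::uint8_t out[16])
{
    const __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(in));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), zero_after_first_zero(v));
}
```

For input that consists only of `0xFF` and `0x00` bytes, as in the question, the mask is already the desired result and the extra AND is redundant. Byte 0 of the register (the lowest address when loaded from memory) is treated as the first byte of the "string".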