So first I'll just describe the task. I need to:
1. Compare two __m128i values.
2. AND the result with a uint16_t value (probably using _mm_movemask_epi8 first and then just &).
3. Do a blend of the initial values based on the result of that.
So the problem is, as you might've guessed, that blend accepts an __m128i as a mask and I will have a uint16_t. So either I need some sort of inverse instruction for _mm_movemask_epi8, or I need to do something else entirely.
Some points: I probably cannot change that uint16_t value to some other type, it's complicated. I'm doing this on SSE4.2, so no AVX. There's a similar question here, How to perform the inverse of _mm256_movemask_epi8 (VPMOVMSKB)?, but it's about AVX and I'm very inexperienced with this, so I cannot adapt the solution.
PS: I might need to do this for ARM as well, and would appreciate any suggestions.
When you do _mm_movemask_epi8 after a vector comparison, which produces -1 for true and 0 for false, you'll get a 16-bit integer (assuming SSE only) having the nth bit set for the nth byte equal to -1 in the vector.
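To make the bit layout concrete, here is a minimal sketch of the forward direction. It assumes an x86-64 target (where SSE2 is always available); the helper name equal_bytes_bitmask and its array-based interface are made up for illustration.

```c
#include <emmintrin.h> /* SSE2: _mm_cmpeq_epi8, _mm_movemask_epi8 */
#include <stdint.h>

/* Hypothetical helper: bit n of the result is set when a[n] == b[n]. */
static uint16_t equal_bytes_bitmask(const uint8_t a[16], const uint8_t b[16]) {
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    /* Each byte of eq is 0xFF (-1) where the inputs match, 0x00 elsewhere. */
    __m128i eq = _mm_cmpeq_epi8(va, vb);
    /* movemask collects the top bit of byte n into bit n of the result. */
    return (uint16_t)_mm_movemask_epi8(eq);
}
```

Byte 0 of the vector maps to the least significant bit of the result and byte 15 to bit 15.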
The following is the reverse (inverse?) operation.
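#include <tmmintrin.h> /* SSSE3, for _mm_shuffle_epi8 */

static inline __m128i bitMaskToByteMask16(int m) {
    /* One distinct bit per byte within each 64-bit half: 0x01, 0x02, ..., 0x80. */
    __m128i sel = _mm_set1_epi64x((long long)0x8040201008040201ULL);
    /* Broadcast byte 0 of m to the low 8 bytes and byte 1 to the high 8 bytes. */
    __m128i spread = _mm_shuffle_epi8(_mm_cvtsi32_si128(m),
                                      _mm_set_epi64x(0x0101010101010101LL, 0));
    /* A byte becomes 0xFF if its selected bit is set in m, 0x00 otherwise. */
    return _mm_cmpeq_epi8(_mm_and_si128(spread, sel), sel);
}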
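Note that you may not need the integer round trip at all: once the integer mask has been converted to a vector mask, you can apply the bitwise operation to the comparison result directly in the vector domain, without going back and forth between integer ops and vector ops.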
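As a rough sketch of how the pieces might fit together for the task in the question (compare, AND with a uint16_t, blend), the example below stays in the vector domain throughout. The names bits_to_bytes and masked_max_i8, the signed greater-than comparison, and the array-based interface are assumptions chosen for illustration, not something taken from the question. The target attribute assumes GCC or Clang; with other compilers, or when building with -msse4.1, it can be dropped.

```c
#include <smmintrin.h> /* SSE4.1: _mm_blendv_epi8 (also provides SSSE3 and SSE2) */
#include <stdint.h>

/* Assumption: GCC/Clang, so the code builds without -msse4.1 on the command line. */
#define TARGET_SSE41 __attribute__((target("sse4.1")))

/* Expand bit n of m into byte n: 0xFF if the bit is set, 0x00 otherwise. */
TARGET_SSE41 static __m128i bits_to_bytes(uint16_t m) {
    const __m128i sel = _mm_set1_epi64x((long long)0x8040201008040201ULL);
    const __m128i idx = _mm_set_epi64x(0x0101010101010101LL, 0);
    __m128i spread = _mm_shuffle_epi8(_mm_cvtsi32_si128(m), idx);
    return _mm_cmpeq_epi8(_mm_and_si128(spread, sel), sel);
}

/* Hypothetical helper: out[n] = b[n] where b[n] > a[n] AND bit n of allow
   is set; otherwise out[n] = a[n]. */
TARGET_SSE41 static void masked_max_i8(const int8_t a[16], const int8_t b[16],
                                       uint16_t allow, int8_t out[16]) {
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    __m128i gt = _mm_cmpgt_epi8(vb, va);                    /* 0xFF where b > a */
    __m128i mask = _mm_and_si128(gt, bits_to_bytes(allow)); /* vector-domain AND */
    /* blendv takes the byte from vb where the mask byte's top bit is set. */
    _mm_storeu_si128((__m128i *)out, _mm_blendv_epi8(va, vb, mask));
}
```

Because the comparison result is never moved to a general-purpose register, _mm_movemask_epi8 is not needed on this path at all.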