As far as I know, integers in C++ can be used as booleans, so we can write code like this:
int a = 6, b = 10;
if (a && b) { /* do something */ }  // the condition is true because both a and b are non-zero
Now, assume that we have:
__m256i a, b;
I need to apply logical AND (&&) to each of the four 64-bit elements of a and b, and return true if at least one pair has both elements non-zero. I mean something like:
(a[0] && b[0]) || (a[1] && b[1]) || ...
Is there a fast way to do this with AVX or AVX2?
I could not find a single instruction that does this, and a plain bitwise AND (&) is definitely not the same thing.
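To make the target behavior concrete, here is a minimal scalar sketch of the expression above. The function name any_pair_nonzero_scalar and the choice of passing the four elements as plain int64_t arrays are assumptions for illustration only.

```cpp
#include <cstdint>

// Hypothetical reference helper: returns true if, for at least one index i,
// both a[i] and b[i] are non-zero, i.e. (a[0] && b[0]) || (a[1] && b[1]) || ...
bool any_pair_nonzero_scalar(const std::int64_t a[4], const std::int64_t b[4]) {
    for (int i = 0; i < 4; ++i) {
        if (a[i] && b[i]) {
            return true;
        }
    }
    return false;
}
```

This also shows why bitwise AND is not enough: for a[i] = 2 and b[i] = 1, a[i] & b[i] is 0 although a[i] && b[i] is true.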
You can cleverly combine a vpcmpeqq with a vptest:
__m256i mask = _mm256_cmpeq_epi64(a, _mm256_set1_epi64x(0));
bool result = ! _mm256_testc_si256(mask, b);
_mm256_testc_si256(mask, b) returns 1 exactly when (~mask & b) == 0, so result is true if and only if (~mask & b) != 0, or
((a[i]==0 ? 0 : -1) & b[i]) != 0 // for some i
// equivalent to
((a[i]==0 ? 0 : b[i])) != 0 // for some i
// equivalent to
a[i]!=0 && b[i]!=0 // for some i
which is equivalent to what you want.
Godbolt link (play around with a and b): https://godbolt.org/z/aTjx7vMKd
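For local experiments, the two intrinsics can be wrapped in a small function. This is only a sketch: the name any_pair_nonzero_avx2 and the array-based interface are assumptions, and the target("avx2") attribute is a GCC/Clang extension used here so the function compiles without -mavx2 (it still needs a CPU with AVX2 to run).

```cpp
#include <immintrin.h>
#include <cstdint>

// Hypothetical wrapper around the vpcmpeqq + vptest combination.
// Returns true if a[i] != 0 && b[i] != 0 for at least one of the four lanes.
__attribute__((target("avx2")))
bool any_pair_nonzero_avx2(const std::int64_t a[4], const std::int64_t b[4]) {
    // Unaligned loads, so the caller's arrays need no special alignment.
    const __m256i va = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(a));
    const __m256i vb = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(b));

    // mask lane is all-ones where a[i] == 0, all-zeros where a[i] != 0.
    const __m256i mask = _mm256_cmpeq_epi64(va, _mm256_setzero_si256());

    // testc returns 1 iff (~mask & vb) == 0, so negate it.
    return !_mm256_testc_si256(mask, vb);
}
```

A lane such as a[i] = 2, b[i] = 1 is a useful test case, because a plain bitwise AND of the two vectors would see that lane as zero.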
If result is used as a loop condition, the compiler should of course branch directly on the carry flag with a jb/jnb instruction instead of materializing the value with setnb.
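As an illustration of the loop-condition case, here is a sketch that scans two arrays in blocks of four 64-bit elements and stops at the first block containing a pair where both elements are non-zero. The function name first_block_with_nonzero_pair and the block-based layout are assumptions, and target("avx2") is again a GCC/Clang extension; whether the compiler actually emits a direct jb/jnb should be checked in the generated assembly.

```cpp
#include <immintrin.h>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: a and b each hold 4 * n_blocks elements.
// Returns the index of the first block with a[i] != 0 && b[i] != 0 in some lane,
// or n_blocks if there is no such block.
__attribute__((target("avx2")))
std::size_t first_block_with_nonzero_pair(const std::int64_t* a,
                                          const std::int64_t* b,
                                          std::size_t n_blocks) {
    const __m256i zero = _mm256_setzero_si256();
    for (std::size_t i = 0; i < n_blocks; ++i) {
        const __m256i va = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(a + 4 * i));
        const __m256i vb = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(b + 4 * i));
        const __m256i mask = _mm256_cmpeq_epi64(va, zero);
        // The test result feeds the branch directly; no bool needs to be stored.
        if (!_mm256_testc_si256(mask, vb)) {
            return i;
        }
    }
    return n_blocks;
}
```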