I encountered the _mm_movemask_epi8 intrinsic in some code and I am trying to understand exactly what it does through an example, since I couldn't entirely work out its behaviour from the description alone.
The description states:
Copies the values of the most significant bits from each 8-bit element in a 128-bit integer vector of [16 x i8] to create a 16-bit mask value, zero-extends the value, and writes it to the destination.
In my specific code I pass the following 128-bit sequence to the function:
ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
and the resulting uint16_t is hexadecimal 01 (binary 0000 0001).
Converting the hex values to binary gives:
11111111 00000000 00000000 00000000
So, as I understand the description, for this example the function should take the most significant bit from each of the four bytes listed above:
11111111 00000000 00000000 00000000 → 1000
and then zero-extend that value to give 0000000000001000. However, the function returns not binary 1000 but binary 0001.
Where did I go wrong?
> In my specific code I pass the following 128-bit sequence to the function:
> ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
If you load these bytes with something like _mm_loadu_si128, they are interpreted as little-endian, as is usual on x86, so as a 128-bit integer the value is 0x00000000'00000000'00000000'000000ff. The 0xff byte at the lowest address becomes element 0 of the vector, and its most significant bit ends up in bit 0 of the mask, which is why you get 0x0001.
Otherwise, you seem to understand it correctly.
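To make the byte-to-bit mapping concrete, here is a minimal sketch (assuming the input bytes live in a plain array and are loaded with _mm_loadu_si128, which your question doesn't specify):

```c
#include <immintrin.h>
#include <stdint.h>
#include <stdio.h>

int main(void) {
    /* Bytes in memory order: the 0xff byte sits at the lowest address. */
    uint8_t bytes[16] = { 0xff, 0, 0, 0, 0, 0, 0, 0,
                          0, 0, 0, 0, 0, 0, 0, 0 };

    /* Unaligned load: byte 0 of the array becomes element 0 of the vector. */
    __m128i v = _mm_loadu_si128((const __m128i *)bytes);

    /* Bit i of the mask is the most significant bit of element i,
       so the 0xff at the lowest address sets the least significant bit. */
    int mask = _mm_movemask_epi8(v);

    printf("0x%04x\n", mask);   /* prints 0x0001 */
    return 0;
}
```

If you wanted the 0xff byte to produce the highest mask bit instead, it would have to be the last byte in memory, i.e. element 15 of the vector.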