The Intel intrinsic functions have the subtype of the vector built into their names. For example, `_mm_set1_ps` is a `ps`, which is a packed single-precision, a.k.a. a `float`. Although the meaning of most of them is clear, their "full name", like packed single-precision, isn't always clear from the function descriptions. I have created the following table. Unfortunately, some entries are missing. What should their values be? Additional questions are below the table.
abbreviation | full name | C/++ equivalent |
---|---|---|
ps | packed single-precision | float |
ph | packed half-precision | None** |
pd | packed double-precision | double |
pch | packed half-precision complex | None** |
pi8 | ??? | int8_t |
pi16 | ??? | int16_t |
pi32 | ??? | int32_t |
epi8 | ??? | int8_t |
epi16 | ??? | int16_t |
epi32 | ??? | int32_t |
epi64 | ??? | int64_t |
epi64x | ??? | int64_t |
Additional questions:

1. What is the difference between `epiX` and `piX`?
2. Why does no `pi64` exist?
3. What is the difference between `epi64` and `epi64x`?

** I have found this, but there seems to be no standard way to represent a half-precision (complex) value in C/++. Please correct me if this has changed in any way.
The suffixes missing from the table are at least `si128` and `si64`, used in bitwise operations, and `[e]pu{8,16,32,64}`, used for unsigned operations.
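To make the signed/unsigned distinction concrete, here is a minimal sketch assuming an x86 target with SSE2 and a GCC- or Clang-style compiler. The helper names (`adds_lane0_unsigned`, `adds_lane0_signed`, `and_lane0`, `suffix_demo`) are made up for illustration. The same byte pattern is added with saturation once through the `epu8` intrinsic and once through the `epi8` one, and `si128` appears as a plain 128-bit bitwise AND.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 */
#include <stdint.h>

/* Saturating add, lanes interpreted as unsigned bytes (epu8). Returns lane 0. */
static uint8_t adds_lane0_unsigned(uint8_t x, uint8_t y) {
    __m128i r = _mm_adds_epu8(_mm_set1_epi8((char)x), _mm_set1_epi8((char)y));
    return (uint8_t)(_mm_cvtsi128_si32(r) & 0xFF);
}

/* Same bit patterns, lanes interpreted as signed bytes (epi8). Returns lane 0. */
static int8_t adds_lane0_signed(uint8_t x, uint8_t y) {
    __m128i r = _mm_adds_epi8(_mm_set1_epi8((char)x), _mm_set1_epi8((char)y));
    return (int8_t)(_mm_cvtsi128_si32(r) & 0xFF);
}

/* si128: the whole register is treated as 128 bits, with no element boundaries. */
static uint8_t and_lane0(uint8_t x, uint8_t mask) {
    __m128i r = _mm_and_si128(_mm_set1_epi8((char)x), _mm_set1_epi8((char)mask));
    return (uint8_t)(_mm_cvtsi128_si32(r) & 0xFF);
}

/* Hypothetical self-check: call from main() to confirm the lane values. */
static void suffix_demo(void) {
    assert(adds_lane0_unsigned(200, 100) == 255); /* 300 saturates to 255 */
    assert(adds_lane0_signed(200, 100) == 44);    /* 0xC8 is -56; -56 + 100 = 44 */
    assert(and_lane0(0xC8, 0x0F) == 0x08);
}
```

The bit pattern `0xC8` is 200 when read as unsigned and -56 when read as signed, which is why the two saturating adds disagree.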
`epi` and `pi` differ in the `e`, which probably means extended; the `epi` register target is a 128-bit XMM register, while `pi` targets the 64-bit MMX registers.
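As a rough illustration of the width difference, the sketch below performs the same 16-bit lane addition on an `__m64` and on an `__m128i`. It assumes an x86 compiler that still exposes the MMX intrinsics (GCC and Clang do). The function names are invented for this example.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2: __m128i, _mm_add_epi16 */
#include <mmintrin.h>  /* MMX:  __m64,   _mm_add_pi16  */

/* pi16: four 16-bit lanes in a 64-bit MMX value. Returns lane 0. */
static int add_pi16_lane0(short x, short y) {
    __m64 v = _mm_add_pi16(_mm_set1_pi16(x), _mm_set1_pi16(y));
    int lane0 = _mm_cvtsi64_si32(v) & 0xFFFF;
    _mm_empty(); /* leave MMX state so x87 floating point keeps working */
    return lane0;
}

/* epi16: eight 16-bit lanes in a 128-bit XMM value. Returns lane 0. */
static int add_epi16_lane0(short x, short y) {
    __m128i v = _mm_add_epi16(_mm_set1_epi16(x), _mm_set1_epi16(y));
    return _mm_cvtsi128_si32(v) & 0xFFFF;
}

/* Hypothetical self-check comparing lane counts and results. */
static void width_demo(void) {
    assert(sizeof(__m64) / sizeof(short) == 4);
    assert(sizeof(__m128i) / sizeof(short) == 8);
    assert(add_pi16_lane0(3, 4) == 7);
    assert(add_epi16_lane0(3, 4) == 7);
}
```

The arithmetic per lane is identical; only the register type and the number of lanes change.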
`pi64` does not exist, because the original MMX instruction set was limited to 32-bit elements; `si64` is still available.
The main argument for using `epi64x` instead of `epi64` has to do with the lack of function overloading in C. There was a need to provide set/conversion functions both for `__m128i _mm_set1_epi64(__m64)`, which moves from MMX to XMM, and for `__m128i _mm_set1_epi64x(int64_t)`, which works with integers. Additionally, it seems that in the rest of the cases the `64x` suffix is reserved for modes requiring a 64-bit architecture, as in `movq` between a general-purpose register and the low half of an `__m128i`, which could otherwise be emulated by multiple instructions, and for something like `__int64 _mm_cvtsd_si64x(__m128d a)`, which converts a double to a 64-bit register target (not directly to memory).
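The two `set1` variants can be put side by side. This is only a sketch, assuming SSE2 plus MMX intrinsics on GCC or Clang; `set1_epi64_both` and `set1_demo` are illustrative names, not library functions. Both calls broadcast the value 42 into the two 64-bit lanes and differ only in the argument type.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2 */
#include <mmintrin.h>  /* MMX  */
#include <stdint.h>

/* Broadcast 42 into both 64-bit lanes, once from an __m64 and once from an integer. */
static void set1_epi64_both(int64_t from_m64[2], int64_t from_int[2]) {
    __m64 m = _mm_set_pi32(0, 42);   /* 64-bit MMX value: high dword 0, low dword 42 */
    __m128i a = _mm_set1_epi64(m);   /* epi64:  argument is an __m64 */
    _mm_empty();                     /* done with MMX state */
    __m128i b = _mm_set1_epi64x(42); /* epi64x: argument is a plain 64-bit integer */
    _mm_storeu_si128((__m128i *)from_m64, a);
    _mm_storeu_si128((__m128i *)from_int, b);
}

/* Hypothetical self-check: both variants should give identical lanes. */
static void set1_demo(void) {
    int64_t a[2], b[2];
    set1_epi64_both(a, b);
    assert(a[0] == 42 && a[1] == 42);
    assert(b[0] == 42 && b[1] == 42);
}
```

In C++ these could have been one overloaded function; in C each argument type needs its own name.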
What I would speculate is that `si64` and `si128` mean scalar integer of width 64/128. Notice that there exists `_mm_add_si64` (which is not an original SSE intrinsic, but an SSE2 intrinsic that extends the original MMX instruction set and uses MMX registers). It's `si64`, not `pi64`, because only one element, of the same size as the whole register, is involved.
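One way to see the "single element as wide as the register" reading is to compare `_mm_add_si64` with `_mm_add_pi32` on the same inputs. The sketch below assumes SSE2 and MMX intrinsics on a little-endian x86 target; the function names are hypothetical. With `si64` the carry out of the low 32 bits propagates into the high half, whereas with `pi32` each 32-bit lane wraps on its own.

```c
#include <assert.h>
#include <emmintrin.h> /* SSE2: _mm_add_si64 */
#include <mmintrin.h>  /* MMX:  _mm_add_pi32 */
#include <stdint.h>
#include <string.h>

/* Adds 0x00000000FFFFFFFF + 1 as one 64-bit scalar and as two 32-bit lanes. */
static void add_whole_vs_lanes(uint64_t *whole, uint64_t *lanes) {
    __m64 a = _mm_set_pi32(0, -1);  /* low dword 0xFFFFFFFF, high dword 0 */
    __m64 b = _mm_set_pi32(0, 1);   /* low dword 1, high dword 0 */
    __m64 s = _mm_add_si64(a, b);   /* si64: carry crosses into the high half */
    __m64 p = _mm_add_pi32(a, b);   /* pi32: low lane wraps to 0, high lane stays 0 */
    memcpy(whole, &s, sizeof s);
    memcpy(lanes, &p, sizeof p);
    _mm_empty();
}

/* Hypothetical self-check of the two results. */
static void si64_demo(void) {
    uint64_t whole, lanes;
    add_whole_vs_lanes(&whole, &lanes);
    assert(whole == UINT64_C(0x0000000100000000));
    assert(lanes == 0);
}
```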
Lastly, `piN` means packed integer of element size N targeting MMX (`__m64`), and `epiN` means packed integer of element size N targeting XMM (`__m128i`).