intel sse intrinsics sse2 mmx

What are the names and meanings of the intrinsic vector element types, like epi64x or pi32?


The Intel intrinsic functions have the element type of the vector built into their names. For example, _mm_set1_ps has the suffix ps, which stands for packed single-precision, i.e. float. Although the meaning of most suffixes is clear, their "full name", such as packed single-precision, isn't always clear from the function descriptions. I have created the following table, but some entries are missing. What should they be? Additional questions follow the table.

abbreviation   full name                        C/++ equivalent
ps             packed single-precision          float
ph             packed half-precision            None**
pd             packed double-precision          double
pch            packed half-precision complex    None**
pi8            ???                              int8_t
pi16           ???                              int16_t
pi32           ???                              int32_t
epi8           ???                              int8_t
epi16          ???                              int16_t
epi32          ???                              int32_t
epi64          ???                              int64_t
epi64x         ???                              int64_t

Additional questions:

  1. Have I missed any?
  2. What is the difference between epiX and piX?
  3. Why does no pi64 exist?
  4. What is the difference between epi64 and epi64x?

** I have found this, but there seems to be no standard way to represent a half precision (complex) value in C/++. Please correct me if this has changed in any way.


Solution

    1. The missing suffixes include at least si128 and si64, used in bitwise operations, and [e]pu{8,16,32,64}, used for unsigned operations.

    2. epi and pi differ in the e, which probably means extended: epi intrinsics target a 128-bit XMM register, while pi intrinsics target the 64-bit MMX registers.

    3. pi64 does not exist because the original MMX instruction set was limited to 32-bit elements; si64 is still available.

    4. The main reason for having epi64x alongside epi64 has to do with the lack of function overloading in C. Set/conversion functions were needed both for __m128i _mm_set1_epi64(__m64), which moves from MMX to XMM, and for __m128i _mm_set1_epi64x(int64_t), which takes a plain integer. Additionally, it seems that in the remaining cases the 64x suffix is reserved for operations requiring a 64-bit architecture, as in movq between a general-purpose register and the low half of an __m128i (which could otherwise be emulated by multiple instructions), and for something like __int64 _mm_cvtsd_si64x(__m128d a), which converts a double to a 64-bit integer in a register (not directly to memory).
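As a rough illustration of points 1 and 4, here is a minimal sketch in C, assuming an x86-64 compiler with SSE2 available. The helper names (broadcast_epi64x, adds_signed_bytes, adds_unsigned_bytes) are made up for this example; only the _mm_* calls are real intrinsics. It broadcasts a plain 64-bit integer with the epi64x form and contrasts the signed (epi8) and unsigned (epu8) saturating adds on the same byte values.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: broadcast a plain 64-bit integer into both
   64-bit lanes (the "epi64x" form) and store the lanes to out[0..1]. */
void broadcast_epi64x(int64_t value, int64_t out[2])
{
    __m128i v = _mm_set1_epi64x(value);
    _mm_storeu_si128((__m128i *)out, v);
}

/* Hypothetical helper: saturating add of 16 bytes treated as signed (epi8). */
void adds_signed_bytes(const int8_t a[16], const int8_t b[16], int8_t out[16])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_adds_epi8(va, vb));
}

/* Hypothetical helper: saturating add of 16 bytes treated as unsigned (epu8). */
void adds_unsigned_bytes(const uint8_t a[16], const uint8_t b[16], uint8_t out[16])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_adds_epu8(va, vb));
}
```

With every byte set to 100 in both inputs, the signed version saturates at 127, while the unsigned version yields 200, even though the input bits are identical.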

    What I would speculate is that si64 and si128 mean scalar integer of width 64/128. Note that _mm_add_si64 exists (it is not an original SSE intrinsic; it is an SSE2 intrinsic that extends the original MMX instruction set and uses MMX registers). It is si64, not pi64, because only one element, of the same size as the whole register, is involved.
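One way to see the "whole register as a single element" reading of si128 is to compare a shift with the si128 suffix against one with the epi64 suffix. This is only a sketch, assuming SSE2 and a little-endian x86 target; shift_as_si128 and shift_as_epi64 are hypothetical helper names introduced for the example.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: shift the whole 128-bit register left by one byte.
   Bits may cross from the low 64-bit half into the high half. */
void shift_as_si128(const uint64_t in[2], uint64_t out[2])
{
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_si128((__m128i *)out, _mm_slli_si128(v, 1));
}

/* Hypothetical helper: shift each 64-bit lane left by 8 bits independently.
   Bits shifted out of a lane are discarded. */
void shift_as_epi64(const uint64_t in[2], uint64_t out[2])
{
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_si128((__m128i *)out, _mm_slli_epi64(v, 8));
}
```

For the input {0xFF00000000000000, 0}, the si128 shift carries the top byte into the upper half and produces {0, 0xFF}, whereas the epi64 shift drops it and produces {0, 0}.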

    Lastly, piN means packed integer of element size N targeting MMX (__m64), and epiN means packed integer of element size N targeting XMM (__m128i).
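To make the role of the element size N concrete, the following sketch adds the same 128 bits once as sixteen 8-bit elements and once as four 32-bit elements. It assumes SSE2 on a little-endian x86 target, and add_as_epi8 and add_as_epi32 are hypothetical names chosen for this example. Only the XMM (epi) forms are shown; the MMX (pi) forms are left out because __m64 support varies between compilers.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: add the inputs as 16 independent 8-bit elements.
   A carry out of one byte never reaches the neighbouring byte. */
void add_as_epi8(const uint32_t a[4], const uint32_t b[4], uint32_t out[4])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi8(va, vb));
}

/* Hypothetical helper: add the inputs as 4 independent 32-bit elements.
   Carries propagate within each 32-bit element. */
void add_as_epi32(const uint32_t a[4], const uint32_t b[4], uint32_t out[4])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi32(va, vb));
}
```

Adding 1 to 0xFF in every 32-bit slot gives 0x100 with the epi32 form but 0 with the epi8 form, because the low byte wraps around and the carry is dropped.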