I was expecting to find functions like __builtin_ia32_fmaddps512 in a recent GCC, to enable use of the 512-bit AVX512 registers in the same way that one can use the 256-bit AVX2 registers, but according to the manual they do not exist in GCC 9.2. Is it just a matter of waiting, or is there some policy reason why they don't exist?
AVX512 builtins take a mask (which can be -1, i.e. all bits set, so that no element is masked off).
The portable intrinsic _mm512_fmadd_ps (#include <immintrin.h>) is defined in GCC 9.1's headers as:
    extern __inline __m512
    __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
    _mm512_fmadd_ps (__m512 __A, __m512 __B, __m512 __C)
    {
      return (__m512) __builtin_ia32_vfmaddps512_mask ((__v16sf) __A,
                                                        (__v16sf) __B,
                                                        (__v16sf) __C,
                                                        (__mmask16) -1,
                                                        _MM_FROUND_CUR_DIRECTION);
    }
I found this by looking in /usr/lib/gcc/x86_64-pc-linux-gnu/9.1.0/include/avx512*.h on my system. (Don't include those headers directly; only pull them in via immintrin.h.)
IDK why you'd want to use __builtin_ia32_vfmaddps512_mask instead of one of Intel's intrinsics like _mm512_mask_fmadd_ps (merging into the first operand), _mm512_mask3_fmadd_ps (merging into the +c operand), or _mm512_maskz_fmadd_ps (zero-masking). Or even the full _mm512_maskz_fmadd_round_ps, which allows specifying a rounding override in addition to masking.
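For illustration, here is a minimal sketch of the portable intrinsics in use, so the difference between the unmasked, merge-masked and zero-masked forms is concrete. The helper names (fma16, fma16_mask, fma16_maskz, fma16_selfcheck) are made up for this example and are not part of GCC or Intel's API. It assumes GCC on x86-64 and uses the target attribute so it compiles without -mavx512f; the self-check returns -1 instead of executing AVX512 instructions when the CPU lacks AVX512F.

```c
#include <immintrin.h>

/* Hypothetical helper: out[i] = a[i]*b[i] + c[i] for all 16 floats. */
__attribute__((target("avx512f")))
static void fma16(const float *a, const float *b, const float *c, float *out)
{
    __m512 va = _mm512_loadu_ps(a);
    __m512 vb = _mm512_loadu_ps(b);
    __m512 vc = _mm512_loadu_ps(c);
    _mm512_storeu_ps(out, _mm512_fmadd_ps(va, vb, vc));
}

/* Merge-masking: lanes whose mask bit is clear keep the value of a[i]. */
__attribute__((target("avx512f")))
static void fma16_mask(const float *a, const float *b, const float *c,
                       __mmask16 k, float *out)
{
    __m512 va = _mm512_loadu_ps(a);
    __m512 vb = _mm512_loadu_ps(b);
    __m512 vc = _mm512_loadu_ps(c);
    _mm512_storeu_ps(out, _mm512_mask_fmadd_ps(va, k, vb, vc));
}

/* Zero-masking: lanes whose mask bit is clear become 0.0f. */
__attribute__((target("avx512f")))
static void fma16_maskz(const float *a, const float *b, const float *c,
                        __mmask16 k, float *out)
{
    __m512 va = _mm512_loadu_ps(a);
    __m512 vb = _mm512_loadu_ps(b);
    __m512 vc = _mm512_loadu_ps(c);
    _mm512_storeu_ps(out, _mm512_maskz_fmadd_ps(k, va, vb, vc));
}

/* Returns the number of mismatching lanes (0 means everything agreed),
   or -1 if the CPU does not support AVX512F and nothing was run. */
int fma16_selfcheck(void)
{
    if (!__builtin_cpu_supports("avx512f"))
        return -1;

    float a[16], b[16], c[16], full[16], merged[16], zeroed[16];
    for (int i = 0; i < 16; i++) {
        a[i] = (float)i;   /* small integers keep the arithmetic exact */
        b[i] = 2.0f;
        c[i] = 1.0f;
    }

    const __mmask16 low8 = 0x00FF;   /* only lanes 0..7 are active */
    fma16(a, b, c, full);
    fma16_mask(a, b, c, low8, merged);
    fma16_maskz(a, b, c, low8, zeroed);

    int bad = 0;
    for (int i = 0; i < 16; i++) {
        float want = a[i] * b[i] + c[i];
        int active = i < 8;
        bad += full[i]   != want;
        bad += merged[i] != (active ? want : a[i]);
        bad += zeroed[i] != (active ? want : 0.0f);
    }
    return bad;
}
```

Calling fma16_selfcheck() from main and checking for 0 (or -1 on a machine without AVX512F) is enough to try it. In real code you would more likely build with -mavx512f or -march=native and drop the target attributes.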
But anyway, that's how you can find the names of the real GCC builtins beneath any Intel intrinsic (if there is one).