I've been searching on Google but couldn't find anything useful.
typedef int64_t v4si __attribute__ ((vector_size(32)));
//warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
// so isn't AVX already automatically enabled?
// What does it mean "without AVX enabled"?
// What does it mean "changes the ABI"?
inline v4si v4si_gt0(v4si x_);
//warning: The ABI for passing parameters with 32-byte alignment has changed in GCC 4.6
// So why is there a warning, and what does it mean?
// Why did only this parameter get a warning,
// while all the other v4si parameters/arguments got none?
void set_quota(v4si quota);
That's not legacy code. __attribute__ ((vector_size(32))) means a 32-byte vector, i.e. 256 bits, which (on x86) means AVX. (See GNU C Vector Extensions.)
AVX isn't enabled unless you use -mavx (or a -march setting that includes it, preferably -march=x86-64-v3 to enable the other goodies normally found on AVX2 CPUs, or -march=native). Without that, the compiler isn't allowed to generate code that uses AVX instructions, because those would trigger an illegal-instruction fault on older CPUs that don't support AVX.
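So the compiler can't pass or return 256-bit vectors in registers, as the normal calling convention specifies. Probably it treats them the same as structs of that size passed by value. That is what "changes the ABI" means: code compiled without AVX and code compiled with AVX would disagree about where such an argument or return value lives.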
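If you can't enable AVX for the whole build, one way to sidestep the warning is to keep 32-byte vectors out of function signatures and pass them by pointer instead. The following is only a sketch: the names v4di and v4di_gt0_ptr are made up for illustration (they are not from the question), and it assumes GCC or Clang on x86-64.

```c
#include <stdint.h>

/* Hypothetical typedef for this sketch: four 64-bit lanes, 32 bytes total. */
typedef int64_t v4di __attribute__((vector_size(32)));

/* Hypothetical helper: the vector is passed and returned through pointers,
 * so no 32-byte vector appears by value in the signature and the calling
 * convention for vector registers is not involved. */
static inline void v4di_gt0_ptr(const v4di *x, v4di *out) {
    const v4di zero = {0, 0, 0, 0};
    /* Element-wise compare: each lane becomes -1 (all bits set) if true, else 0. */
    *out = (v4di)(*x > zero);
}
```

This compiles with or without -mavx. Without it, the compiler has to split each 32-byte operation into narrower pieces, so it is correct but not necessarily fast.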
See the ABI links in the x86 tag wiki, or the x86 Calling Conventions page on Wikipedia (mostly doesn't mention vector registers).
Features like AVX2 can also be enabled on a per-function basis with __attribute__((target("avx2"))) or with a pragma (#pragma GCC target). GCC's immintrin.h does define the necessary types and intrinsic functions even if you don't enable it at compile time, to support per-function target overrides.
This is why, if you call an intrinsic from a function that doesn't have the matching target enabled, the error is that inlining failed because of "target specific option mismatch" rather than an undefined function. See: The Effect of Architecture When Using SSE / AVX Intrinisics
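A per-function override could look roughly like the sketch below. All function names here (add4_avx2, add4_scalar, add4) are invented for the example. It assumes GCC or Clang on x86, and it uses the __builtin_cpu_supports builtin to check at run time so the AVX2 path is only taken on CPUs that have it.

```c
#include <immintrin.h>
#include <stdint.h>

/* AVX2 is enabled for this one function only; the rest of the file is
 * compiled for the baseline target. */
__attribute__((target("avx2")))
static void add4_avx2(const int64_t *a, const int64_t *b, int64_t *out) {
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    _mm256_storeu_si256((__m256i *)out, _mm256_add_epi64(va, vb));
}

/* Portable fallback for CPUs without AVX2. */
static void add4_scalar(const int64_t *a, const int64_t *b, int64_t *out) {
    for (int i = 0; i < 4; i++)
        out[i] = a[i] + b[i];
}

/* Dispatcher: arguments are pointers, so no 256-bit vector crosses a
 * function boundary by value and -Wpsabi has nothing to warn about. */
void add4(const int64_t *a, const int64_t *b, int64_t *out) {
    if (__builtin_cpu_supports("avx2"))
        add4_avx2(a, b, out);
    else
        add4_scalar(a, b, out);
}
```

The intrinsics stay inside the function that carries the target attribute. Calling _mm256_add_epi64 directly from add4 is what would produce the "target specific option mismatch" error.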
Since the GNU C Vector Extensions syntax isn't tied to any particular hardware, using a 32-byte vector will still compile to correct code. It will perform badly, but it will still work even if the compiler can only use SSE instructions. (Last I saw, gcc was known to do a very bad job of generating code to deal with vectors wider than the target machine supports. You'd get significantly better code for a machine with 16-byte vectors from using vector_size(16) manually.)
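For a target without AVX, a manual vector_size(16) version might look something like this. It is a sketch with made-up names (v2di, v2di_gt0, gt0_x4) that handles four int64_t values as two 16-byte halves. On x86-64, 16-byte vectors fit in XMM registers, which are part of the baseline, so passing them by value doesn't trigger the ABI warning.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical typedef: two 64-bit lanes, 16 bytes total. */
typedef int64_t v2di __attribute__((vector_size(16)));

/* Lane-wise x > 0: each lane becomes -1 if true, 0 otherwise. */
static inline v2di v2di_gt0(v2di x) {
    const v2di zero = {0, 0};
    return (v2di)(x > zero);
}

/* Process four elements as two 16-byte vectors. memcpy is used for the
 * loads and stores so the arrays need no particular alignment. */
void gt0_x4(const int64_t in[4], int64_t out[4]) {
    for (int i = 0; i < 4; i += 2) {
        v2di v, m;
        memcpy(&v, in + i, sizeof v);
        m = v2di_gt0(v);
        memcpy(out + i, &m, sizeof m);
    }
}
```

Whether this actually beats the compiler's own lowering of a 32-byte vector depends on the GCC version and flags, so it's worth comparing the generated assembly for both.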
Anyway, the point is that you get a warning instead of a compiler error because __attribute__ ((vector_size(32))) doesn't imply AVX specifically, but AVX or some other 256-bit vector instruction set is required for it to compile to good code.