Consider the following source file test-sha512.c, using SHA512 intrinsics on Arm64 (aarch64):
#include <arm_neon.h>

const uint64_t data[256] = {0,};
void test()
{
    uint64x2_t a = vld1q_u64(data);
    a = vsha512h2q_u64(a, a, a);
}
On Ubuntu 22.10 (virtual machine on a MacBook M1), with gcc 12.2.0, I get the errors "inlining failed in call to ‘always_inline’" and "target specific option mismatch":
$ gcc -c test-sha512.c -march=armv8-a+sha3
In file included from test-sha512.c:1:
/usr/lib/gcc/aarch64-linux-gnu/12/include/arm_neon.h: In function ‘test’:
/usr/lib/gcc/aarch64-linux-gnu/12/include/arm_neon.h:29671:1: error: inlining failed in call to ‘always_inline’ ‘vsha512h2q_u64’: target specific option mismatch
29671 | vsha512h2q_u64 (uint64x2_t __a, uint64x2_t __b, uint64x2_t __c)
| ^~~~~~~~~~~~~~
test-sha512.c:7:9: note: called from here
7 | a = vsha512h2q_u64(a, a, a);
| ^~~~~~~~~~~~~~~~~~~~~~~
/usr/lib/gcc/aarch64-linux-gnu/12/include/arm_neon.h:29671:1: error: inlining failed in call to ‘always_inline’ ‘vsha512h2q_u64’: target specific option mismatch
29671 | vsha512h2q_u64 (uint64x2_t __a, uint64x2_t __b, uint64x2_t __c)
| ^~~~~~~~~~~~~~
test-sha512.c:7:9: note: called from here
7 | a = vsha512h2q_u64(a, a, a);
| ^~~~~~~~~~~~~~~~~~~~~~~
With clang 15.0.6, it compiles correctly and a full SHA512 implementation using Arm64 intrinsics and compiled with clang works correctly.
$ clang -c test-sha512.c -march=armv8-a+sha3
Note: the Arm architecture defines distinct features for SHA1, SHA256, SHA512 and SHA3. However, gcc and clang only know crypto, sha2 and sha3. The SHA512 instructions (cryptographically part of SHA2) are activated with sha3. Weird. Anyway...
The similar Arm64 intrinsics for AES, SHA1 and SHA256 compile correctly with gcc. The problem is specific to SHA512.
Other tests, without success, same error:
- -march option. I tried with all Armv8 options (-march=armv8-a+fp+simd+crypto+crc+lse+fp16+rcpc+rdma+dotprod+aes+sha2+sha3+sm4+fp16fml+sve+profile+rng+memtag+sb+ssbs+predres+sve2+sve2-sm4+sve2-aes+sve2-sha3+sve2-bitperm+tme+i8mm+f32mm+f64mm+bf16+flagm+pauth+ls64+mops) and with -march=armv9-a.
- -march=native, considering that the M1 supports SHA512.
- -mcpu=neoverse-v1 or -mcpu=neoverse-n2 or other known Arm cores which support SHA512.
- gcc --target-help.

Is this a known error? I did not find any reference online about this error on Arm64 SHA512 intrinsics.
EDIT: SHA256 intrinsics also fail with the same error. Only SHA1 intrinsics work. My previous tests with SHA256 were made using clang, sorry.
EDIT 2: SHA256 intrinsics work with -march=armv8-a+sha2+crypto but not with -march=armv8-a+sha2. SHA512 intrinsics still don't work, even with all -march options.
The SHA3/SHA512 extension is documented by ARM only for ARMv8.2-A onward. As such, gcc requires you to use -march=armv8.2-a+sha3 (or v8.3-a, etc.).
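To make the dependency on the architecture level visible in the source, here is a minimal sketch that guards the intrinsic with the ACLE feature macro __ARM_FEATURE_SHA512. I am assuming your gcc and clang versions define that macro when sha3 is enabled on an Armv8.2-A or later baseline. HAVE_SHA512_INTRINSICS and run_sha512h2 are names made up for this example, and the fallback branch only reports that the intrinsic was not compiled in; it does not implement SHA512.

```c
#include <stdint.h>

/* Assumption: the compiler defines __ARM_FEATURE_SHA512 when the SHA512
   instructions are enabled, e.g. with -march=armv8.2-a+sha3. */
#if defined(__aarch64__) && defined(__ARM_FEATURE_SHA512)
#include <arm_neon.h>
#define HAVE_SHA512_INTRINSICS 1   /* hypothetical helper macro */
#else
#define HAVE_SHA512_INTRINSICS 0
#endif

static const uint64_t data[2] = {0, 0};

/* Hypothetical helper: returns 1 if the SHA512 intrinsic was compiled in
   and executed, 0 if this build has no SHA512 support. */
int run_sha512h2(uint64_t out[2])
{
#if HAVE_SHA512_INTRINSICS
    uint64x2_t a = vld1q_u64(data);
    a = vsha512h2q_u64(a, a, a);   /* same call as in the question */
    vst1q_u64(out, a);
    return 1;
#else
    /* No SHA512 instructions in this build: report it instead of failing. */
    out[0] = 0;
    out[1] = 0;
    return 0;
#endif
}
```

Compiled with gcc -c test-sha512.c -march=armv8.2-a+sha3, the intrinsic branch should be selected. With -march=armv8-a+sha3, the macro should stay undefined on gcc, so the file still compiles and takes the fallback branch instead of hitting the "target specific option mismatch" error.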