Consider this program on godbolt:
#include <cassert>
#include <cstdint>

int64_t const x[] = { 42, 2, 3, 4 };

int64_t f() {
    int64_t i;
    asm volatile (
        "xor %[i], %[i]\n\t"
        "movq _ZL1x(,%[i],8), %[i]"
        : [i]"=r"(i));
    return i;
}

int64_t g() {
    int64_t i;
    asm volatile (
        "xor %[i], %[i]"
        : [i]"=r"(i));
    i = x[i];
    return i;
}

int main() {
    assert(f() == 42);
}
It compiles, links and runs fine for gcc 13.1 but clang 16.0 gives a linker error:
[...]: relocation R_X86_64_32S against `.rodata.cst32' can not be used when making a PIE object; recompile with -fPIE
If I try the same code locally and compile it with g++ -O3 main.cpp (same gcc version) I get the same error above. (Recompiling with -fPIE doesn't fix.)
It's worth noticing that the code generated by gcc for f() and g() is identical:
_Z1fv:
    xor %rax, %rax
    movq _ZL1x(,%rax,8), %rax
    ret
_Z1gv:
    xor %rax, %rax
    movq _ZL1x(,%rax,8), %rax
    ret
and if I remove f from the source file and use g in main, then all compilers are happy, locally and on godbolt.
I understand that directly writing _ZL1x in my inline assembly is weird and very likely bad practice, but that's not the point here. I would like to know why gcc is happy on godbolt and not locally and, more importantly, how I could make it work locally and (if possible) for clang as well.
Most Linux distros configure GCC and Clang with -fPIE -pie -fstack-protector-strong on by default, and perhaps -fno-plt. So they're making Position-Independent Executables that have to be relocatable to anywhere in 64-bit address-space.
Matt Godbolt's Compiler Explorer has compilers installed with their vanilla configuration, with none of those options on by default, so they're making traditional executables that get linked to a specific address in the low 31 bits of virtual address space. (Except x86-64 Clang 16.0 which has at least -fPIE enabled. This is purely up to the sysadmins of https://godbolt.org/ using the default config options when building compilers from source.)
Any addressing mode other than symbol(%rip) uses at most a 32-bit sign-extended displacement (see footnote 1), so an absolute address has to fit into that. That only works in a Linux non-PIE executable, and I think a Windows non-LargeAddressAware executable or DLL. Not in MachO64, no matter how you link it.
If you locally did gcc -O3 -S, you could see the PIE-compatible asm it made for your g() function, using a RIP-relative LEA first.
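The same idea can be applied by hand to f(). The following is only a sketch, assuming x86-64 Linux with GCC or Clang: the name f_pie, the scratch operand tmp and the `used` attribute are additions for illustration and are not part of the original code. It still hard-codes the mangled name _ZL1x, so it only links if x stays at file scope with internal linkage.

```cpp
#include <cassert>
#include <cstdint>

// `used` keeps the array in the object file even though no C++ code names it;
// without it the compiler may drop x and the asm reference would not link.
__attribute__((used)) static const int64_t x[] = { 42, 2, 3, 4 };

int64_t f_pie() {
#if defined(__x86_64__)
    int64_t i;
    const int64_t *tmp;  // hypothetical scratch register for the array address
    asm volatile (
        "xor %[i], %[i]\n\t"               // i = 0
        "lea _ZL1x(%%rip), %[tmp]\n\t"     // RIP-relative: valid in a PIE
        "movq (%[tmp],%[i],8), %[i]"       // load x[i] with no absolute disp32
        : [i] "=r"(i), [tmp] "=r"(tmp)
        :
        : "cc");
    return i;
#else
    return x[0];  // fallback so the sketch still builds on other targets
#endif
}
```

The absolute form _ZL1x(,%[i],8) needs an R_X86_64_32S relocation, which is what the linker rejects; splitting it into a RIP-relative LEA plus a register-based load avoids that relocation at the cost of one extra register.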
See [...] mov $x, %edi when they want an address in RDI. [...] movq x(%[tmp],%[i],8), %[i]). That Q&A uses NASM syntax, not GAS AT&T or Intel.
Regarding "(Recompiling with -fPIE doesn't fix.)":
That error message is assuming the machine code was generated by a compiler from a .c file, not just assembled from hand-written asm. With hand-written asm, there is no compilation step, only assembling. Or, you the human are the "compiler", so you need to write position-independent asm to implement the algorithm that exists in your brain. (64-bit absolute addresses are also allowed, with the dynamic linker applying a text-relocation fixup on load, but RIP-relative addressing is more efficient.)
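One way to avoid writing any symbol name in the asm at all is to hand the array address to the asm statement as a register operand, so the compiler generates whatever address computation suits the current code model (RIP-relative LEA for a PIE, an immediate move otherwise). This is a sketch under the same x86-64 GCC/Clang assumption; f_operand and the operand name base are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

static const int64_t x[] = { 42, 2, 3, 4 };

int64_t f_operand() {
#if defined(__x86_64__)
    int64_t i;
    asm volatile (
        "xor %[i], %[i]\n\t"               // i = 0
        "movq (%[base],%[i],8), %[i]"      // load x[i] through the base register
        : [i] "=&r"(i)     // early-clobber: i is written before base is read,
                           // so it must not share a register with base
        : [base] "r"(x)    // the compiler materializes the address of x
        : "cc");
    // x is read-only static data, so this sketch adds no "memory" clobber.
    return i;
#else
    return x[0];  // fallback so the sketch still builds on other targets
#endif
}
```

Because the asm template no longer mentions _ZL1x, this version does not depend on name mangling or on the symbol surviving optimization, and it should build the same way with or without -fPIE.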
Footnote 1: Except the special 64-bit absolute addressing mode for load/store of the accumulator, like movabs foo, %eax, which is less compact than RIP-relative so you don't want it if you're building normal code where static data is within +-2GiB of code. There's a reason compilers don't use it.