Consider this C (not C++!) code:
int g();
int f() {
    return g();
}
Clang (with any optimization level above zero) compiles this to:
f:
    xor eax, eax
    jmp g@PLT
I am trying to understand the reason for zeroing eax before jumping to g.
Various answers on Stack Overflow, especially this one, have led me to conclude that the number of parameters expected by g is not specified (to specify it as 0, we should have written int g(void);), and therefore that the compiler must assume that g can be variadic. In the x86-64 System V ABI, a call to a variadic function requires AL to hold an upper bound on the number of vector registers used to pass arguments, which is why clang needs to zero it (no vector registers are used here).
This all makes sense, but then I'm confused about why gcc doesn't zero the register:
"f":
    jmp "g"
If zeroing this register is necessary per the ABI, then can we use this behavior of gcc to cause a miscompilation? Alternatively, if it's not necessary, then isn't clang wasting an instruction?
GCC15+ defaults to -std=gnu23 where int g(); means int g(void);
With -std=gnu17 or earlier, it will xor eax,eax.
Clang 11+ through current trunk defaults to -std=gnu17, where int g(); isn't a prototype and leaves the arguments unspecified. The current nightly build still defaults to that C version, but with -std=gnu23 it emits the same asm as GCC.
You can check by including int ver = __STDC_VERSION__; (at global scope) in your program. "ver": .long 202311 vs. 201710.
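As a rough sketch of the same check done in code rather than by reading the asm: the helper names below are hypothetical, and the 202311L threshold assumes the compiler reports the final C23 value of __STDC_VERSION__ (some compilers report a smaller placeholder value in their pre-release C2x modes, which this sketch would misclassify).

```c
#include <assert.h>

/* Hypothetical helper: returns 1 if, for the given __STDC_VERSION__ value,
   `int g();` is a prototype equivalent to `int g(void);` (C23 and later),
   and 0 if it leaves the parameters unspecified (C17 and earlier).
   Assumes the compiler reports the final C23 value 202311L. */
int empty_parens_is_prototype(long stdc_version) {
    assert(stdc_version > 0);
    return stdc_version >= 202311L;
}

/* Hypothetical helper: answers the same question for the language mode
   this translation unit is actually being compiled in. */
int current_mode_has_prototype(void) {
#ifdef __STDC_VERSION__
    return empty_parens_is_prototype(__STDC_VERSION__);
#else
    return 0; /* C89/C90 does not define __STDC_VERSION__ */
#endif
}
```

Calling current_mode_has_prototype() from a program built with each -std= option should line up with whether the compiler emits xor eax,eax for the call in f.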
Godbolt with GCC and Clang each with -std=gnu17 and -std=gnu23, showing that's what makes the difference.
You're using GCC16 (nightly build); we can tell since it puts symbol names in quotes (e.g. jmp "g") in Intel syntax so it won't face-plant on a C function whose name is also a register name. Without the quotes, a call to a function named rax would compile to jmp rax, which assembles as an indirect jump through the register, like AT&T jmp *%rax. In previous GCC versions, Intel syntax was very much a second-class citizen, nice for human reading but not fully robust for production use.
Your GCC version defaults to -std=gnu23.
@Joshua is correct that this would be a valid optimization even in ISO C17 for an unprototyped function when the call-site passes no args. It's UB if the passed args don't match the callee's definition, and in C17 a variadic function requires at least one named parameter. So a call with zero args can only be to a function that never accepts any args, and thus isn't variadic and doesn't need AL=0.
(Or is it actually not UB if the callee doesn't touch those args on the path of execution actually taken? If that's the case, this argument doesn't fully hold.)
But that only applies if the callee is also written in C, which is not necessarily true. GCC and Clang aim to be useful for systems programming where the callee could be written in assembly language, or compiler-generated from some other language.
If you properly prototype your functions, like writing int g(void); when that's what you mean, there's no need for the compiler to aggressively optimize.
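For illustration, here is the question's code with a real prototype. This is only a sketch: the definition of g and its return value 42 are made up so the example links and can be run. With int g(void); the declaration means the same thing in every C version, so the compiler knows the callee isn't variadic and has no reason to set AL before the tail call.

```c
#include <assert.h>

/* Real prototype: g takes no arguments, regardless of -std= setting. */
int g(void);

int f(void) {
    /* The callee cannot be variadic, so no AL=0 is needed here. */
    return g();
}

/* Hypothetical definition, only here so the example is complete;
   in the question, g lives in another translation unit. */
int g(void) {
    return 42;
}
```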
Fun fact: in practice with variadic callees generated by modern compilers, AL=0 or non-zero is all they check for. And non-zero will simply result in dumping all 8 XMM regs to a stack array, even if the value is > 8 in violation of the ABI.
But a wrong value in AL will actually break variadic callees generated by older compilers, which used AL for a computed jump into a sequence of movaps instructions. The Q&A "Why does printf still work with RAX lower than the number of FP args in XMM registers?" compares GCC4.5 (computed jump) vs. more recent GCC (test/jz conditional branch, all-or-nothing).
With variadic callees generated by more modern compilers with the all-or-nothing conditional branch, leaving non-zero garbage in AL just costs performance if there weren't actually any variadic FP args. Violating the ABI won't cause a correctness problem. But binary libraries compiled by older or different compilers, or hand-written in asm, are supposed to inter-operate with modern compiled code that still follows the same ABI.
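To make the AL convention concrete, here is a minimal sketch of a variadic callee that takes floating-point args. The function name sum_doubles is hypothetical. On x86-64 System V, a caller of sum_doubles(2, 1.5, 2.5) passes the two doubles in XMM0 and XMM1 and is expected to set AL to an upper bound on the number of vector registers used (typically exactly 2 here), while the callee's prologue uses AL to decide whether to save the XMM registers for va_arg to read later.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical variadic callee: sums `count` double arguments.
   It has one named parameter, as C17 requires for variadic functions. */
double sum_doubles(int count, ...) {
    va_list ap;
    double total = 0.0;

    assert(count >= 0);
    va_start(ap, count);          /* start after the last named parameter */
    for (int i = 0; i < count; i++)
        total += va_arg(ap, double);  /* each variadic arg is read as double */
    va_end(ap);
    return total;
}
```

A call like sum_doubles(0) passes no vector registers, which is the case where AL=0 lets the callee skip saving them.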