My understanding is that RIP-relative addressing should work for offsets of up to ±2 GB, but for some reason GCC (14.2) and Clang (19.1.0) stop using it when grabbing values 16 MB or more away from the symbol.
Given this code:
const int size = 1 << 22;

int small_array[size];
// 6 byte mov
int load_small_arry() {
    return small_array[sizeof(small_array) / sizeof(int) - 1];
}

int big_array[size + 1];
// 5 byte mov + 6 byte mov in clang
// 9 byte mov in gcc
int load_big_arry() {
    return big_array[sizeof(big_array) / sizeof(int) - 1];
}
I get this assembly from GCC (see the Clang results in the Godbolt link; they differ, but Clang likewise switches away from RIP-relative):
load_small_arry():
        mov     eax,DWORD PTR [rip+0x0]        # 6 <load_small_arry()+0x6>
                        R_X86_64_PC32 small_array+0xfffff8
        ret
        nop     WORD PTR [rax+rax*1+0x0]
load_big_arry():
        movabs  eax,ds:0x0
                        R_X86_64_64 big_array+0x1000000
        ret
This is a larger encoding, so I'm not sure why it would be preferred.
The relevant code in GCC is here. It seems it's not really specific to RIP-relative addressing. The more general rule is that GCC assumes a value of the form static_label + constant_offset is encodable as a signed 32-bit immediate only when constant_offset < 16 MB. There's a comment:
For CM_SMALL assume that latest object is 16MB before end of 31bits boundary.
It looks like the idea is that they want to support the use of pointers like static_label + constant_offset even when the result exceeds the 2 GB limit. In the small code model, static_label is known to be within that limit, and they assume further that it's at least 16 MB from the end: every static symbol is taken to start no higher than 2^31 − 2^24 = 0x7F000000, so static_label + constant_offset stays below 0x80000000, and thus fits in a signed 32-bit immediate, for any constant_offset under 16 MB. But if constant_offset is 16 MB or larger, they no longer trust that the result will fit, and they fall back to code that doesn't need it to.
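That threshold is easy to demonstrate. Here's a minimal sketch of my own (hypothetical names; the sizes mirror the question's arrays and straddle the 16 MB boundary):

/* sizes chosen so the last element sits just below / exactly at 16 MB */
static int under_array[1 << 22];        /* last element at byte offset 0xFFFFFC */
static int over_array[(1 << 22) + 1];   /* last element at byte offset 0x1000000 */

int load_under(void) { return under_array[(1 << 22) - 1]; } /* offset < 16 MB: RIP-relative */
int load_over(void)  { return over_array[1 << 22]; }        /* offset == 16 MB: wider form */

With the compilers above, load_under should keep the 6-byte RIP-relative mov while load_over gets the wider encoding, exactly as in the question's output.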
I was originally thinking that this situation couldn't arise in well-defined ISO C or C++ code, because you're only allowed to do pointer arithmetic within a single array, and if the array is static, then all of it fits within 2 GB. So I thought maybe this code provided some sort of extension for compatibility, or for other language front-ends.
But actually, it can arise even in well-defined C, because it is fine to compile code that would access an array out of bounds, as long as that code is never actually executed. And the compiler may not be able to tell at compile time which is which.
Consider a program like:
file1.c
#include <stdbool.h>

#ifdef BE_HUGE
char arr[2000000000];
const bool is_huge = true;
#else
char other_stuff[2000000000];
char arr[3];
const bool is_huge = false;
#endif
file2.c
#include <stdbool.h>

extern char arr[];
extern const bool is_huge;

char foo(void) {
    return is_huge ? arr[1999999999] : -1;
}
There's nothing illegal about this code. But the compiler can't safely emit mov al, [rip+arr+1999999999] in foo(). That would be fine in the BE_HUGE case, because then arr+1999999999 doesn't exceed the 2 GB limit. But in the !BE_HUGE case, it might. The instruction would never actually be executed in that case, but it still has to link successfully.

When compiling file2.c, the compiler doesn't know which case we are in, so it must generate code that runs correctly in one case and still links in the other, and that rules out the narrower addressing mode.
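What the compiler does instead is, in effect, keep the symbol and the large offset apart. A hand-lowered C sketch of the idea (my own illustration, not GCC's literal output; the movabs form in the question's assembly achieves the same separation by carrying the full address in a 64-bit R_X86_64_64 relocation):

#include <stdbool.h>

extern char arr[];
extern const bool is_huge;

char foo_lowered(void) {
    if (!is_huge)
        return -1;
    /* &arr[0] carries no constant offset, so it always fits the narrow
     * addressing modes.  The 1999999999-byte displacement is then applied
     * through a 64-bit register at run time, so no relocation ever has to
     * squeeze label + offset into a signed 32-bit field. */
    const char *base = arr;
    return base[1999999999];
}

Either way, the large constant never lands in a 32-bit displacement against the symbol, which is exactly what the 16 MB rule guards against.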