I’m trying to understand the nuances of the restrict keyword in C, particularly how its behavior might differ when applied to a pointer to a structure versus a pointer to a primitive type like int.
Are there any specific optimization considerations when using restrict with pointers to structures?
Does the compiler handle aliasing differently for structures compared to primitive types when restrict is used?
Are there any potential pitfalls or edge cases to be aware of when applying restrict to structure pointers?
I’m looking for insights into how restrict interacts with structures and whether there are any differences in behavior or performance compared to primitive types.
All three questions pretty much boil down to the same special rule: a pointer to a struct may alias a pointer to any type that is also the type of a member of that struct. This means restrict might improve performance slightly, as the compiler no longer needs to assume that a write through a pointer to int might modify some int member of the struct.
Example:
    #include <stdio.h>

    typedef struct
    {
        int a;
        double b;
    } struct_t;

    void func (struct_t* s, int* i)
    {
        s->a++;
        *i = 123;
        printf("%d\n", s->a);
    }
With clang 20.1 targeting x86-64 at -O3, I get this code generation:
    func:
        inc dword ptr [rdi]
        mov dword ptr [rsi], 123
        mov esi, dword ptr [rdi]
        lea rdi, [rip + .L.str]
        xor eax, eax
        jmp printf@PLT
That is, a is incremented by 1 in place, and then, after *i has been modified, a is reloaded from memory (mov esi, dword ptr [rdi]), because the compiler can't assume that the write to *i didn't change a.
After changing to struct_t* restrict s:
    func:
        mov eax, dword ptr [rdi]
        inc eax
        mov dword ptr [rdi], eax
        mov dword ptr [rsi], 123
        lea rdi, [rip + .L.str]
        mov esi, eax
        xor eax, eax
        jmp printf@PLT
This isn't obviously much faster, but the inc is now carried out on a register instead of as a read-modify-write in RAM. The compiler doesn't have to assume that the write of 123 affected a. It stores the new value of a back to RAM before writing 123 to *i, and later passes the value of a that is still held in a register to printf (mov esi, eax), without reloading it from RAM.
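As for pitfalls, the flip side of this optimization can be sketched as follows. This is a minimal illustration, not a definitive test: func_plain, func_restrict and demo are hypothetical names, and the functions return s->a instead of printing it so that the result can be checked. With distinct objects both versions behave the same. When i points at s->a, only the non-restrict version has defined behavior, since with restrict the caller promises that a is accessed only through s for the duration of the call.

```c
#include <assert.h>

typedef struct
{
    int a;
    double b;
} struct_t;

/* No restrict: the compiler must assume that *i may overlap s->a. */
int func_plain (struct_t* s, int* i)
{
    s->a++;
    *i = 123;
    return s->a;   /* has to be reloaded after the write through i */
}

/* restrict on s: the caller promises that, during the call, the members
   modified through s are not accessed through any other pointer. */
int func_restrict (struct_t* restrict s, int* i)
{
    s->a++;
    *i = 123;
    return s->a;   /* may come straight from a register */
}

/* Hypothetical driver showing which calls are fine. */
void demo (void)
{
    struct_t s = { 1, 0.0 };
    int other = 0;

    /* Distinct objects: both versions are well-defined and agree. */
    assert(func_plain(&s, &other) == 2);
    s.a = 1;
    assert(func_restrict(&s, &other) == 2);
    assert(other == 123);

    /* Aliasing is allowed without restrict: a becomes 2, then 123. */
    s.a = 1;
    assert(func_plain(&s, &s.a) == 123);

    /* func_restrict(&s, &s.a) would be undefined behavior, so it is
       deliberately not called here. */
}
```

So the main thing to watch out for is a caller passing a pointer to a member of the same struct. The compiler is not required to diagnose that, and the restrict version may then silently use a stale value.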