I'm writing C code for an embedded system. In this system, there are memory mapped registers at some fixed address in the memory map, and of course some RAM where my data segment / heap is.
I'm having trouble getting optimal code generated when my code intermixes accesses to global variables in the data segment with accesses to hardware registers. This is a simplified snippet:
#include <stdint.h>
uint32_t * const restrict HWREGS = (uint32_t *)0x20000;
struct {
uint32_t a, b;
} Context;
void example(void) {
Context.a = 123;
HWREGS[0x1234] = 5;
Context.b = Context.a;
}
This is the code generated on x86 (see also on godbolt):
example:
mov DWORD PTR Context[rip], 123
mov DWORD PTR ds:149712, 5
mov eax, DWORD PTR Context[rip]
mov DWORD PTR Context[rip+4], eax
ret
As you can see, after the hardware register has been written, Context.a is reloaded from RAM before being stored into Context.b. This doesn't make sense because Context is at a different memory address than HWREGS. In other words, the memory pointed to by HWREGS and the memory pointed to by &Context do not alias, but it looks like there is no way to tell that to the compiler.
If I change the definition of HWREGS to this:
extern uint32_t * const restrict HWREGS;
that is, if I hide the fixed memory address from the compiler, I get this:
example:
mov rax, QWORD PTR HWREGS[rip]
mov DWORD PTR [rax+18640], 5
movabs rax, 528280977531
mov QWORD PTR Context[rip], rax
ret
Context:
.zero 8
Now the two writes to Context are optimized (even coalesced into a single write), but on the other hand the access to the hardware register no longer happens as a direct memory access; it goes through a pointer indirection.
Is there a way to obtain optimal code here? I would like GCC to know that HWREGS is at a fixed memory address and, at the same time, that it does not alias Context.
If you want to prevent the compiler from repeatedly reloading values from a memory region (possibly because of aliasing), the best approach is not to use global variables, or at least not to access them directly. The restrict keyword seems to be ignored for global variables (especially here on HWREGS) by both GCC and Clang. Using the restrict keyword on function parameters solves this problem:
#include <stdint.h>
uint32_t * const HWREGS = (uint32_t *)0x20000;
struct Context {
uint32_t a, b;
} context;
static inline void exampleWithLocals(uint32_t* restrict localRegs, struct Context* restrict localContext) {
localContext->a = 123;
localRegs[0x1234] = 5;
localContext->b = localContext->a;
}
void example(void) {
exampleWithLocals(HWREGS, &context);
}
Here is the result (see also on godbolt):
example:
movabs rax, 528280977531
mov DWORD PTR ds:149712, 5
mov QWORD PTR context[rip], rax
ret
context:
.zero 8
Please note that the strict aliasing rule does not help in this case, since the type of the variables/fields being read and written is always uint32_t.
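To illustrate the point about strict aliasing, here is a minimal sketch with two hypothetical helper functions (the names same_type and different_type are made up for this example and are not part of the original code). It assumes the usual type-based aliasing rules, which GCC and Clang apply when optimizing unless -fno-strict-aliasing is given.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only. */

/* Both pointers refer to uint32_t, so type-based alias analysis cannot
   rule out that regs and a designate the same object. The compiler has
   to reload *a after the store through regs. */
uint32_t same_type(uint32_t *regs, uint32_t *a) {
    *a = 123;
    *regs = 5;
    return *a;   /* 5 if regs == a, otherwise 123 */
}

/* Here the second object is a uint16_t. A store through a uint32_t
   lvalue may be assumed not to modify a uint16_t object, so the
   compiler is allowed (not required) to return 123 without reloading. */
uint16_t different_type(uint32_t *regs, uint16_t *a) {
    *a = 123;
    *regs = 5;
    return *a;
}
```

In the code from the question, both Context.a and the register are uint32_t, which corresponds to the first function; that is why restrict is needed to express the missing no-alias guarantee.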
Besides this, based on its name, the variable HWREGS looks like it points to hardware registers. Please note that the pointed-to data should be declared volatile so that the compiler does not keep register values cached in CPU registers or perform similar optimizations (such as assuming the pointed-to value is unchanged as long as the code does not change it).
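As a rough sketch of how the volatile advice might be combined with the restrict-parameter approach, the wrapper could take a pointer to volatile data. The name example_with_locals is illustrative, and whether the compiler still merges the two stores to the context in this form is an assumption that should be checked in the generated assembly.

```c
#include <stdint.h>

struct Context {
    uint32_t a, b;
};

/* regs points to volatile data, so every register access written in the
   source is actually performed. restrict on both parameters is a promise
   by the caller that the register block and the context never overlap. */
static inline void example_with_locals(volatile uint32_t *restrict regs,
                                       struct Context *restrict ctx) {
    ctx->a = 123;
    regs[0x1234] = 5;   /* volatile store to the device register */
    ctx->b = ctx->a;    /* ctx is not volatile, so 123 may be forwarded */
}

/* On the target, the caller would pass the fixed address from the
   question, for example:
   example_with_locals((volatile uint32_t *)0x20000, &context); */
```

Because the register pointer is a parameter, the same function can also be exercised on a host machine by passing an ordinary array of at least 0x1235 elements in place of the hardware block.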