I'm trying to compile the test code below, which just writes a 32-bit variable through a pointer. I write it once using byte accesses, and a second time using a word access.
#include <stdint.h>

void load_data_8(uint32_t value, void* d) {
    uint8_t* d_ptr = d;
    /* little-endian byte order */
    *d_ptr++ = (value>>0)&0xFF;
    *d_ptr++ = (value>>8)&0xFF;
    *d_ptr++ = (value>>16)&0xFF;
    *d_ptr++ = (value>>24)&0xFF;
    /* big-endian byte order */
    *d_ptr++ = (value>>24)&0xFF;
    *d_ptr++ = (value>>16)&0xFF;
    *d_ptr++ = (value>>8)&0xFF;
    *d_ptr++ = (value>>0)&0xFF;
}

void load_data_32(uint32_t value, void* d) {
    uint32_t* d_ptr = d;
    *d_ptr = value;
}
Compiler: ARM GCC 11.2.1
Compiler flags: -mcpu=cortex-m7 -O3
(The Cortex-M7 has unaligned memory access instructions.)
The compiler produces the following:
load_data_8:
    rev r3, r0
    str r0, [r1] @ unaligned
    str r3, [r1, #4] @ unaligned
    bx lr
load_data_32:
    str r0, [r1]
    bx lr
main:
    movs r0, #0
    bx lr
And if I compile the same code for cortex-m0plus, which has even fewer capabilities for unaligned memory access, I get this:
Compiler flags: -mcpu=cortex-m0plus -O3
load_data_8:
    push {r4, lr}
    lsrs r3, r0, #8
    lsrs r2, r0, #16
    uxtb r4, r0
    uxtb r3, r3
    uxtb r2, r2
    lsrs r0, r0, #24
    strb r4, [r1]
    strb r3, [r1, #1]
    strb r2, [r1, #2]
    strb r0, [r1, #3]
    strb r0, [r1, #4]
    strb r2, [r1, #5]
    strb r3, [r1, #6]
    strb r4, [r1, #7]
    pop {r4, pc}
load_data_32:
    str r0, [r1]
    bx lr
C-M7 test: What is the reason for the @ unaligned comment in the load_data_8 function for Cortex-M7, but not in load_data_32? How does the compiler know that the data pointer in load_data_32 won't be unaligned?
C-M0+ test: Why doesn't it produce the same code for load_data_8 and load_data_32, given that in both cases we write 32 bits of data in the CPU's endianness (little)? What makes it different from the core's standpoint whether the type is 8-bit or 32-bit, given that the memory is contiguous?
Both questions have the same answer: when you convert a void * to a uint32_t *, the compiler is allowed to assume that the pointer you converted was already properly aligned for uint32_t (i.e. to 4 bytes, on this platform). Thus in load_data_32, you get a single word-size str. On the M7, the compiler doesn't annotate it as unaligned because it assumes it is aligned. And on M0+, it can emit an instruction that actually requires alignment.
So it is up to you to ensure that the void * pointer passed to load_data_32 actually is aligned to 4 bytes. If it isn't, then according to the C standard, the behavior is undefined. In this particular instance, the M7 code will work as expected, and the M0+ code will fault.
In other words, the compiler knows that the pointer is aligned because under the rules of the language, you, the programmer, implicitly promised that it would be (though perhaps you didn't realize you were making such a promise). That's a binding contract and the compiler can hold you to it, on penalty of undefined behavior.
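One way to keep that promise is to check the address before calling load_data_32. The following is only a minimal sketch: is_aligned_u32 is a hypothetical helper name, not part of the original code, and it assumes a target where converting a pointer to uintptr_t yields its address (the C standard leaves that conversion implementation-defined).

```c
#include <stdalign.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true if p is suitably aligned for a
   uint32_t access. Assumes the pointer-to-integer conversion gives the
   address, which holds on flat-address targets such as Cortex-M. */
bool is_aligned_u32(const void *p) {
    return ((uintptr_t)p % alignof(uint32_t)) == 0;
}
```

A caller could test is_aligned_u32(d) and fall back to a byte-wise routine such as load_data_8 when it returns false.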
In load_data_8, you convert the void * to uint8_t *. The compiler can thus only assume that it is aligned properly for uint8_t, meaning, on this platform, no particular alignment (1 byte). On Cortex M7, it knows that 32-bit str can still be used, but annotates it as unaligned just to make the programmer and/or compiler developer aware of this. On Cortex M0+, since 32-bit str doesn't work for unaligned pointers, it has to emit a longer sequence of strb.
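If the destination may be unaligned, memcpy is another way to express a 32-bit store without promising any alignment, much like the byte-wise version. The sketch below uses a hypothetical name, load_data_32_any, which is not in the original code. Compilers commonly replace a small fixed-size memcpy with whatever store instructions the target permits, but that is an optimization, not a guarantee.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical variant of load_data_32: copies the object representation
   of value to d in the CPU's native byte order. memcpy accesses memory as
   bytes, so no alignment is assumed for d. */
void load_data_32_any(uint32_t value, void *d) {
    memcpy(d, &value, sizeof value);
}
```

Because the copy uses the native byte order, on a little-endian core it writes the same four bytes as the first half of load_data_8.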
Actually, according to the strict aliasing rule (oversimplified), the void * passed to load_data_32 is essentially required to have been the result of converting a pointer to an actual uint32_t object. void * in modern C isn't meant as a tool for arbitrary type punning (accessing chunks of memory as one type, then as another). Rather, it lets you bypass type checking so that a single pointer object could be used to hold a pointer to any of several different types - but it's up to your program's logic to know what that type actually was, and ensure that it gets converted back to the same type before dereferencing. (There is an exception for character types so that things like memcpy can be written generically.)
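To illustrate both points, here is a small sketch with two hypothetical functions (add_one and sum_bytes are names made up for this example). The first converts a void * back to the type it originally pointed to; the second relies on the character-type exception to read any object byte by byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback-style helper: the caller promises that ctx was
   obtained from a pointer to a real uint32_t object, so converting it
   back and dereferencing is well-defined. */
void add_one(void *ctx) {
    uint32_t *counter = ctx;
    *counter += 1;
}

/* Character-type exception: any object may be examined through an
   unsigned char pointer, whatever its actual type. */
unsigned sum_bytes(const void *obj, size_t n) {
    const unsigned char *p = obj;
    unsigned total = 0;
    for (size_t i = 0; i < n; i++) {
        total += p[i];
    }
    return total;
}
```

Passing add_one a pointer into an arbitrary byte buffer would break the first function's contract, while sum_bytes accepts a pointer to any object.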