Compilers are getting smarter and smarter these days. So, is volatile still needed?
I tried many scenarios and it seemed like the compiler would do the optimizations correctly.
If your answer is NO, please give me some counterexamples. I am a C language novice.
For a global variable, I have found only one case where the compiler doesn't re-read the value from memory, shown below. Can you provide other examples?
int flag = 0;
while(flag == 0);
The smarter compilers are, the more important it is to use volatile where it's needed, to stop compilers from optimizing in ways you don't want. Or it would be better to say that it's always critically important to use volatile where necessary, but the chance of Bad Things actually happening in practice is higher with smarter compilers if you miss any cases that need it.
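For the while(flag == 0); loop in the question, here is a rough sketch of what an optimizer is allowed to do when flag is a plain global. The function names are made up for illustration, and the second function is only a hand-written approximation of the transformed code, not actual compiler output.

```c
// Plain (non-volatile, non-atomic) global, as in the question.
int flag = 0;

// The loop as written in the source.
void wait_as_written(void) {
    while (flag == 0) {
        // empty body: nothing in this loop can change flag
    }
}

// Roughly what the optimizer may turn it into: nothing inside the loop
// writes flag, so it is loaded once and the loop no longer re-checks it.
void wait_as_optimized(void) {
    if (flag == 0) {
        for (;;) {
            // never exits, even if another thread or an ISR sets flag later
        }
    }
}
```

Both functions behave the same in a single-threaded program, which is exactly why the compiler is allowed to make the change. Only code that expects flag to be modified from outside the normal flow of the program notices the difference.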
In a Linux kernel context specifically, see Who's afraid of a big bad optimizing compiler? for why Linux's READ_ONCE / WRITE_ONCE macros are important for lock-free atomics. Concurrent read+write of a non-atomic variable from separate threads is a data race, which is undefined behaviour in ISO C, so aggressive optimizations are allowed for normal code that doesn't use volatile. GNU C plus the underlying hardware defines the behaviour of volatile enough to get meaningful behaviour.
(See also "MCU programming - C++ O2 optimization breaks while loop": most code shouldn't actually use volatile for multithreading or interrupts. Use lock-free types from stdatomic.h like _Atomic int.)
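As a minimal sketch of the stdatomic.h approach, assuming a hosted platform with POSIX threads: the names ready, payload, producer and wait_for_payload are hypothetical, chosen only for this example.

```c
#include <pthread.h>
#include <stdatomic.h>

static _Atomic int ready = 0;  // hypothetical flag shared between threads
static int payload = 0;        // plain data, published via the flag

// Writes the data, then sets the flag with release ordering so the
// payload store is visible to any thread that observes ready == 1.
static void *producer(void *arg) {
    (void)arg;
    payload = 42;
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

// Starts the producer, spins until the flag is set, and returns the
// payload. Returns -1 if the thread could not be created.
int wait_for_payload(void) {
    pthread_t t;
    payload = 0;
    atomic_store_explicit(&ready, 0, memory_order_relaxed);
    if (pthread_create(&t, NULL, producer, NULL) != 0)
        return -1;
    // The atomic load is performed on every iteration.
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0) {
        // spin
    }
    int result = payload;  // safe: ordered after the acquire load
    pthread_join(t, NULL);
    return result;
}
```

Unlike volatile, the release/acquire pair also orders the plain payload access relative to the flag, which is what makes reading payload after the loop well-defined.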
Or a trivial example like writing to an MMIO register in a device driver:
#include <stdint.h>

static volatile uint32_t *const DEVICE_REGISTER = (void*)0x100000;

void write_two_things(){
    *DEVICE_REGISTER = 0x1111;
    *DEVICE_REGISTER = 0x2222;
}
Without volatile, the first store would be optimized away so the PCI device would never see it. With plain variables, only the final value matters, so stores that get overwritten before anything could read them (without UB) might as well not have been done in the first place. This is called dead store elimination.
See it for yourself on Godbolt, compiling to x86-64 asm with GCC -O2, along with an alternate version that doesn't use volatile.
write_two_things:
        mov     DWORD PTR ds:1048576, 4369    # 0x1111
        mov     DWORD PTR ds:1048576, 8738    # 0x2222
        ret
write_two_things_buggy:
        # first store was optimized away as being dead
        mov     DWORD PTR ds:2097152, 8738    # 0x2222
        ret
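The source of the non-volatile version isn't reproduced here. As a sketch of the same contrast that can be run against ordinary memory, the two hypothetical functions below take the "register" as a pointer parameter instead of using a fixed MMIO address. That parameterization is an assumption made for the example, not the code from the Godbolt link.

```c
#include <stdint.h>

// volatile-qualified target: both stores must be emitted, in order.
void write_two_things_to(volatile uint32_t *reg) {
    *reg = 0x1111;
    *reg = 0x2222;
}

// Plain target: the first store is dead (overwritten before anything in
// this program could legally read it), so the compiler may remove it.
void write_two_things_plain(uint32_t *reg) {
    *reg = 0x1111;  // candidate for dead store elimination
    *reg = 0x2222;
}
```

Called on a normal variable, both functions leave 0x2222 behind, so the difference is invisible from inside the program. It only shows up in the generated asm, or to hardware that reacts to each individual store.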