On x86 machines, instructions like inc and addl are not atomic, and in an SMP environment it is not safe to use them without the lock prefix. In a UP environment, however, they are safe, since inc, addl and other simple instructions cannot be interrupted partway through.
My question is: given a C-level statement like
x = x + 1;
is there any guarantee that the C compiler will always use UP-safe instructions like
incl %eax
rather than thread-unsafe ones (for example, implementing the C statement as several instructions, which may be interrupted by a context switch), even in a UP environment?
There is absolutely no guarantee that "x = x + 1"
will compile to interrupt-safe instructions on any platform, including x86. It may well be safe for a specific compiler and a specific processor architecture, but it is not mandated by the standard at all, and the standard is the only guarantee you get.
You can't consider anything to be safe based on what you think it will compile down to. Even if a specific compiler/architecture states that it is, relying on it is very bad since it reduces portability. Other compilers, other architectures, or even later versions of the same compiler on the same architecture can break your code quite easily.
It's quite feasible that x = x + 1
could compile to an arbitrary sequence such as:
load r0,[x] ; load memory into reg 0
incr r0 ; increment reg 0
stor [x],r0 ; store reg 0 back to memory
on a CPU that has no memory-increment instructions. Or it may be smart and compile it into:
lock ; disable task switching (interrupts)
load r0,[x] ; load memory into reg 0
incr r0 ; increment reg 0
stor [x],r0 ; store reg 0 back to memory
unlock ; enable task switching (interrupts)
where lock
disables and unlock
enables interrupts. But, even then, this may not be thread-safe in an architecture that has more than one of these CPUs sharing memory (the lock
may only disable interrupts for one CPU), as you've already stated.
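To make the risk of the three-instruction sequence concrete, here is a minimal single-threaded sketch in C that replays one unlucky interleaving by hand. The function name simulate_lost_update and the variables r0_a and r0_b (standing in for each task's private copy of r0) are made up for illustration; nothing here is real compiler output.

```c
/* Replays one unlucky interleaving of two tasks, A and B, that each
 * execute the load / incr / stor sequence on the same variable x.
 * Everything runs in one thread, so the result is deterministic. */
int simulate_lost_update(void)
{
    int x = 0;   /* the shared variable */
    int r0_a;    /* task A's private copy of register r0 (hypothetical) */
    int r0_b;    /* task B's private copy of register r0 (hypothetical) */

    r0_a = x;    /* A: load r0,[x]   A sees 0 */

    /* Context switch: B runs its whole sequence before A continues. */
    r0_b = x;    /* B: load r0,[x]   B also sees 0 */
    r0_b += 1;   /* B: incr r0 */
    x = r0_b;    /* B: stor [x],r0   x is now 1 */

    /* Switch back to A, which still holds the stale value 0. */
    r0_a += 1;   /* A: incr r0 */
    x = r0_a;    /* A: stor [x],r0   x is 1 again, B's update is lost */

    return x;
}
```

Two increments were executed, yet simulate_lost_update() returns 1 rather than 2. That is exactly the failure the question is worried about, and it needs only one CPU and one badly timed context switch.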
The language itself (or libraries for it, if it's not built into the language) will provide thread-safe constructs and you should use those rather than depend on your understanding (or possibly misunderstanding) of what machine code will be generated.
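As one example of a construct the language itself provides, C11 added <stdatomic.h>. The sketch below assumes a C11 compiler that ships that header (it is an optional part of the standard); the names shared_x and increment_x are invented for illustration.

```c
#include <stdatomic.h>

/* Shared counter declared with an atomic type (assumes C11 atomics). */
static atomic_int shared_x = 0;

/* Atomically performs the equivalent of x = x + 1 and returns the new
 * value. atomic_fetch_add returns the value held before the addition,
 * hence the + 1. */
int increment_x(void)
{
    return atomic_fetch_add(&shared_x, 1) + 1;
}
```

Here it is the standard, not a guess about the generated machine code, that promises the read-modify-write is indivisible. The compiler then picks whatever instructions the target needs.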
Things like Java synchronized
and pthread_mutex_lock()
(available to C on some operating systems) are what you want to look into.
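A minimal sketch of the mutex approach, assuming a POSIX system with pthreads available (typically built with -pthread). The names x_lock, worker and run_two_workers are hypothetical helpers, not part of any library.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t x_lock = PTHREAD_MUTEX_INITIALIZER;
static int x = 0;   /* shared variable, only touched while holding x_lock */

/* Each worker increments x n times, holding the mutex around every update. */
static void *worker(void *arg)
{
    int n = *(int *)arg;
    for (int i = 0; i < n; i++) {
        pthread_mutex_lock(&x_lock);
        x = x + 1;   /* safe however the compiler chooses to implement it */
        pthread_mutex_unlock(&x_lock);
    }
    return NULL;
}

/* Runs two workers concurrently and returns the final value of x,
 * or -1 if a thread could not be created. */
int run_two_workers(int n)
{
    pthread_t a, b;

    x = 0;
    if (pthread_create(&a, NULL, worker, &n) != 0)
        return -1;
    if (pthread_create(&b, NULL, worker, &n) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return x;
}
```

With the lock in place, run_two_workers(100000) should come back as 200000 on both UP and SMP machines, because no increment can interleave with another. The mutex costs more than a bare instruction would, but its correctness does not depend on what the compiler emits.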