By looking at the Peterson's algorithm, the equivalent implementation in C++ is:
#include <thread>
#include <atomic>
int main(){
std::atomic<bool> flag[2] = {false,false};
std::atomic<int> turn = {0};
std::thread t1([&](){
flag[0] = true; // flag_set_0
turn = 1; // turn_set_0
while(flag[1] && turn == 1){ // flag_read_0, turn_read_0
//busy wait
}
// critical section
flag[0] = false;
});
std::thread t2([&](){
flag[1] = true; // flag_set_1
turn = 0; // turn_set_1
while(flag[0] && turn == 0){ // flag_read_1, turn_read_1
//busy wait
}
// critical section
flag[1] = false;
});
t1.join();
t2.join();
}
In this program, if the execution order is:
flag[0] = true
turn = 1
flag[1] = true
while(flag[1] && turn == 1){} // true
turn = 0
while(flag[0] && turn == 0){} //true
The single total order is
flag_set_0 < turn_set_0 < flag_set_1 < flag_read_0 < turn_read_0 < turn_set_1 < flag_read_1 < turn_read_1
,
which is a valid single total order without contradiction since it obeys the constraints required by t1
and t2
:
t1: flag_set_0 < turn_set_0 < flag_read_0 < turn_read_0
t2: flag_set_1 < turn_set_1 < flag_read_1 < turn_read_1
This means the program can have both threads simultaneously enter the busy-wait loop and two infinite loops make the program unterminated, is my understanding right?
Yes, they can both enter. But one of them entered when turn
was different, before the last store to it. On the next iteration its loop condition will be false. This isn't a deadlock.
The spin loops check both flag[other_thread]
and turn
every iteration, with opposite conditions for turn
that can't be true for both loops indefinitely.
Yes, I see your meaning, the only possibility is that the implementation does not obey [atomics.order] p11(should within a reasonable amount of time), such that it always sees turn conform to the condition even if the other thread has changed it. But it is not the fault of the algorithm, it's the fault of the implementations, right? In this situation, all the operations in the condition would precede the change to the turn in the single total order
S
, right?
These are seq_cst
operations so I'm not sure if even delayed visibility is possible after both loops have entered their spin loops. That involves them both having read turn after writing it.
Maybe it's just an implementation detail that real hardware has to wait for a seq_cst
store to become globally visible before a seq_cst
load can reload it (or actually for all previous seq_cst stores to become visible). But if not, then the last turn=
assignment is already visible to the other thread by the time the second thread enters its spin-loop. Maybe no seq_cst
requirements would be violated if both threads read opposite turn
values for multiple iterations in this case; if so, then yes only the "should" visibility promptness requirements prevent a deadlock.
If you think a well-known algorithm that's been known for decades is broken, it's highly likely that you've made a mistake and should keep looking for a reason why it's not broken. (Or if you need to ask for help understanding it, at least phrase your question with the assumption that the algorithm is correct. e.g. "How does Peterson's algorithm avoid a deadlock with both threads entering the spin loops?" Phrasing it that way might prompt you to look at the loop conditions again, or not.)