I wrote a small test project to check whether std::call_once blocks while the callable is executing. The output suggests that call_once behaves in two different ways: it blocks on detached threads but not on joined ones. I strongly suspect this cannot be true, but I cannot draw any other conclusion. Please guide me to the correct one.
#include <chrono>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>
using namespace std;
once_flag f;
mutex cout_sync;
void my_pause()
{
volatile int x = 0;
for(int i=0; i<2'000'000'000; ++i) { x++; }
}
void thr(int id)
{
auto start = chrono::system_clock::now();
call_once(f, my_pause);
auto end = chrono::system_clock::now();
scoped_lock l{cout_sync};
cout << "Thread " << id << " finished in " << (static_cast<chrono::duration<double>>(end-start)).count() << " sec" << endl;
}
int main()
{
vector<thread> threads;
for(int i=0; i<4; i++)
{
threads.emplace_back(thr, i);
threads.back().join();
}
return 0;
}
Output:
Thread 0 finished in 4.05423 sec
Thread 1 finished in 0 sec
Thread 2 finished in 0 sec
Thread 3 finished in 0 sec
Changing threads to detached:
for(int i=0; i<4; i++)
{
threads.emplace_back(thr, i);
threads.back().detach();
}
this_thread::sleep_for(chrono::seconds(5));
Output:
Thread 0 finished in 4.08223 sec
Thread 1 finished in 4.08223 sec
Thread 3 finished in 4.08123 sec
Thread 2 finished in 4.08123 sec
Compiled with Visual Studio 2017.
This happens because, in the joined version, you join each thread before starting the next one.
The blocking itself follows from the specification of call_once:
If that invocation throws an exception, it is propagated to the caller of call_once, and the flag is not flipped so that another call will be attempted (such call to call_once is known as exceptional).
This means that if the function passed to call_once throws an exception, it is not considered to have been called, and the next call to call_once will invoke it again.
Consequently, the entire call_once() is effectively protected by an internal lock. While the function is being executed, any other thread that enters call_once() must block until the function returns, because only then is it known whether the flag gets set or another attempt is needed.
In the first version you join the threads one at a time, so the second thread is not even started until call_once has already returned in the first thread.
In the detached version you start all four threads at effectively the same time, so all four enter call_once approximately together.
One of those threads ends up executing the function.
The other threads block until the function either returns or throws an exception.
As a result, all four threads have to wait.
This has nothing to do with detached threads.
If you change the first version of the code to start all four threads first, and only then join them all, you'll see the same behavior as in the detached version.