c++ multithreading singleton

Does a thread_local singleton perform lazy initialization by default?


I have the following code of a thread_local singleton:

#include <iostream>

struct Singl {
  int& ref;

  Singl(int& r) : ref(r) {}
  ~Singl() {}

  // Prints the address of the referenced int.
  void print() { std::cout << &ref << std::endl; }
};

static auto& singl(int& r) {
  static thread_local Singl i(r);
  return i;
}

int main() {
  int x = 4;
  singl(x).print();

  int y = 55;
  singl(y).print();

  return 0;
}

This program prints the address of x twice.

The compiler (GCC 8.1 on godbolt) seems to initialize the singleton object lazily:

singl(int&):
        push    rbp
        mov     rbp, rsp
        sub     rsp, 16
        mov     QWORD PTR [rbp-8], rdi
        mov     rax, QWORD PTR fs:0
        add     rax, OFFSET FLAT:guard variable for singl(int&)::i@tpoff
        movzx   eax, BYTE PTR [rax]
        test    al, al
        jne     .L5
        mov     rax, QWORD PTR [rbp-8]
        mov     rdx, QWORD PTR fs:0
        add     rdx, OFFSET FLAT:singl(int&)::i@tpoff
        mov     rsi, rax
        mov     rdi, rdx
        call    Singl::Singl(int&)
        mov     rax, QWORD PTR fs:0
        add     rax, OFFSET FLAT:guard variable for singl(int&)::i@tpoff
        mov     BYTE PTR [rax], 1
        mov     rax, QWORD PTR fs:0
        add     rax, OFFSET FLAT:singl(int&)::i@tpoff
        mov     edx, OFFSET FLAT:__dso_handle
        mov     rsi, rax
        mov     edi, OFFSET FLAT:_ZN5SinglD1Ev
        call    __cxa_thread_atexit
.L5:
        mov     rax, QWORD PTR fs:0
        add     rax, OFFSET FLAT:singl(int&)::i@tpoff
        leave
        ret

Is this the behaviour I can expect whenever I make multiple calls to the singl function with different arguments? Or is it possible that the singleton object might be initialized a second time on a subsequent call?


Solution

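  • This is indeed guaranteed. A block-scope static or thread_local variable is initialized exactly once (once per thread for thread_local), the first time control passes through its declaration. Arguments passed on later calls are not used.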

    A few points to take note of:

    1. For variables that are static but not thread_local, if multiple threads call the function concurrently, only one performs the initialization and the others wait for it to finish.

    2. If the initialization throws an exception, it is considered incomplete and is attempted again the next time control reaches the declaration. The guard variable in the disassembly records whether initialization has completed: it is set to 1 only after the constructor returns.

    In other words, they just work™.
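The rules above can be checked with a small sketch. The names below (Tracker, tracker, Flaky, flaky, and the helper functions) are hypothetical and not part of the question's code. Each check runs in a fresh std::thread so that the thread_local objects start out uninitialized.

```cpp
#include <cassert>
#include <stdexcept>
#include <thread>

// Hypothetical stand-in for Singl: remembers the address it was built with.
struct Tracker {
    const int* addr;
    explicit Tracker(const int& r) : addr(&r) {}
};

// One Tracker per thread, constructed on that thread's first call only.
Tracker& tracker(const int& r) {
    static thread_local Tracker t(r);
    return t;
}

// Hypothetical type whose constructor counts attempts and can fail on demand.
struct Flaky {
    Flaky(bool fail, int& attempts) {
        ++attempts;
        if (fail) throw std::runtime_error("construction failed");
    }
};

Flaky& flaky(bool fail, int& attempts) {
    static thread_local Flaky f(fail, attempts);
    return f;
}

// Within one thread, the second call does not re-initialize the object.
bool second_call_ignores_argument() {
    bool ok = false;
    std::thread([&ok] {
        int x = 4, y = 55;
        const int* first = tracker(x).addr;
        const int* second = tracker(y).addr;  // y is not used
        ok = (first == &x && second == &x);
    }).join();
    return ok;
}

// Each thread initializes its own instance from its own first argument.
bool each_thread_gets_its_own_instance() {
    int a = 1, b = 2;
    const int* seen_a = nullptr;
    const int* seen_b = nullptr;
    std::thread ta([&] { seen_a = tracker(a).addr; });
    std::thread tb([&] { seen_b = tracker(b).addr; });
    ta.join();
    tb.join();
    return seen_a == &a && seen_b == &b;
}

// A throwing initialization is retried; a completed one is never repeated.
int attempts_after_failed_then_successful_init() {
    int attempts = 0;
    std::thread([&attempts] {
        try {
            flaky(true, attempts);   // attempt 1: throws, stays uninitialized
        } catch (const std::runtime_error&) {
        }
        flaky(false, attempts);      // attempt 2: succeeds
        flaky(true, attempts);       // already initialized: no constructor call
    }).join();
    return attempts;
}

// Convenience helper that runs all three checks.
void run_checks() {
    assert(second_call_ignores_argument());
    assert(each_thread_gets_its_own_instance());
    assert(attempts_after_failed_then_successful_init() == 2);
}
```

Calling run_checks() from main should pass silently. With GCC or Clang on Linux, building may require the -pthread flag.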