c++ linux gcc pthreads gold-linker

undefined behavior in shared lib using libpthread, but not having it in ELF as dependency


When linked "properly" (explained below), both function calls below block indefinitely inside the pthread calls that implement cv.notify_one and cv.wait_for:

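// let's call it odr.cpp, which forms libodr.so

#include <chrono>
#include <condition_variable>
#include <mutex>

std::mutex mtx;
std::condition_variable cv;
bool ready = false;

void Notify() {
  // Constructs and discards a duration; this is a no-op, not a sleep.
  std::chrono::milliseconds(100);
  std::unique_lock<std::mutex> lock(mtx);
  ready = true;
  cv.notify_one();  // called while still holding mtx
}

void Get() {
  std::unique_lock<std::mutex> lock(mtx);
  // Timed wait without a predicate: returns on notify, timeout, or spurious wakeup.
  cv.wait_for(lock, std::chrono::milliseconds(300));
}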

when the shared library above is used in the following application:

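// let's call it test.cpp, which forms a.out

#include <iostream>
#include <thread>

// Defined in libodr.so (odr.cpp).
void Notify();
void Get();

int main() {
  // Producer thread: sets the flag and notifies.
  std::thread thr([&]() {
    std::cout << "Notify\n";
    Notify();
  });

  // Consumer: waits up to 300 ms on the condition variable.
  std::cout << "Before Get\n";
  Get();
  std::cout << "After Get\n";

  thr.join();
}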

The problem reproduces only when linking libodr.so with g++, with the gold linker, and with -lpthread provided as a dependency,

with following versions of relevant tools:

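so that we end up with __pthread_key_create as a WEAK undefined symbol of type FUNC and no libpthread.so dependency in the ELF, as shown here: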

$ g++ -fPIC -shared -o build/libodr.so build/odr.cpp.o -fuse-ld=gold -lpthread && readelf -d build/libodr.so | grep Shared && readelf -Ws build/libodr.so | grep -m1 __pthread_key_create
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
    10: 0000000000000000     0 FUNC    WEAK   DEFAULT  UND __pthread_key_create

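On the other hand, we experience no bug with any of the following: clang++ instead of g++, the bfd linker instead of gold, no explicit -lpthread, or -lpthread together with -Wl,--no-as-needed.

Note: this time we have either a WEAK NOTYPE symbol with no libpthread.so dependency, or a WEAK FUNC symbol with a libpthread.so dependency, as shown here: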

$ clang++ -fPIC -shared -o build/libodr.so build/odr.cpp.o -fuse-ld=gold -lpthread && readelf -d build/libodr.so | grep Shared && readelf -Ws build/libodr.so | grep -m1 __pthread_key_create && ./a.out 
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
    24: 0000000000000000     0 FUNC    WEAK   DEFAULT  UND __pthread_key_create@GLIBC_2.2.5 (7)

$ g++ -fPIC -shared -o build/libodr.so build/odr.cpp.o -fuse-ld=bfd -lpthread && readelf -d build/libodr.so | grep Shared && readelf -Ws build/libodr.so | grep -m1 __pthread_key_create && ./a.out 
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
    14: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __pthread_key_create

$ g++ -fPIC -shared -o build/libodr.so build/odr.cpp.o -fuse-ld=gold && readelf -d build/libodr.so | grep Shared && readelf -Ws build/libodr.so | grep -m1 __pthread_key_create && ./a.out 
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
    18: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __pthread_key_create

$ g++ -fPIC -shared -o build/libodr.so build/odr.cpp.o -fuse-ld=gold -Wl,--no-as-needed -lpthread && readelf -d build/libodr.so | grep Shared && readelf -Ws build/libodr.so | grep -m1 __pthread_key_create && ./a.out 
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
    10: 0000000000000000     0 FUNC    WEAK   DEFAULT  UND __pthread_key_create@GLIBC_2.2.5 (4)

Complete example to compile/run can be found here: https://github.com/aurzenligl/study/tree/master/cpp-pthread

What breaks a shared library that uses pthread when __pthread_key_create is WEAK and there is no libpthread.so dependency in the ELF? Does the dynamic linker take the pthread symbols from libc.so (stubs) instead of libpthread.so?


Solution

  • There's a lot happening here: differences between gcc and clang, differences between GNU ld and gold, the --as-needed linker flag, two different failure modes, and maybe even some timing issues.

    Let's start with how to link a program using POSIX threads.

    The compiler's -pthread flag is all you should need. It's a compiler flag, so you should use it both when compiling code that uses threads and when linking the final executable. When you use -pthread on the link step, the compiler will provide the -lpthread flag automatically, and in the right place in the link line.

    Typically, you would only use it when linking the final executable, and not when linking a shared library. If you simply want to make your library thread safe, but don't want to force every program that uses your library to link with pthreads, you'd want to use a runtime check to see if the pthreads library is loaded, and call the pthread APIs only if it is. On Linux, this is typically done by checking a "canary" -- for example, make a weak reference to an arbitrary symbol like __pthread_key_create, which will only be defined if the library is loaded, and will have the value 0 if the program was linked without it.
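
    The "canary" check described above can be sketched roughly as follows. This is a minimal illustration, not the code any particular library uses: the helper name PthreadsLoaded is hypothetical, and the sketch assumes GCC or clang on Linux with glibc. On glibc versions where the pthread functions live in libc itself, the symbol is presumably always defined, so the check would always report true there.

```cpp
#include <pthread.h>

// Weak declaration of the canary symbol. If no loaded library defines it,
// the reference resolves to address 0 instead of causing a load failure.
extern "C" int __pthread_key_create(pthread_key_t*, void (*)(void*))
    __attribute__((weak));

// Hypothetical helper: reports whether the canary symbol was resolved,
// i.e. whether a library defining the pthread API is loaded.
bool PthreadsLoaded() {
  return &__pthread_key_create != nullptr;
}
```

    A library would consult such a check once and skip its locking when it reports false.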

    In your case, however, your library libodr.so pretty much depends on threads, so it's reasonable to link it with the -pthread flag.

    That brings us to the first failure mode: if you use g++ and gold for both link steps, the program throws std::system_error and says you need to enable multithreading. This is due to the --as-needed flag. GCC passes --as-needed to the linker by default, while clang (apparently) does not. With --as-needed, the linker will only record library dependencies that resolve a strong reference. Since all the references to pthread APIs are weak, none of them are sufficient to tell the linker that libpthread.so should be added to the dependency list (via a DT_NEEDED entry in the dynamic table). Changing to clang or adding a -Wl,--no-as-needed flag solves this problem, and the program will load the pthread library.
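
    To see how this first failure mode surfaces in code, here is a small sketch that tries to start a thread and reports the exception text instead of terminating. The helper name TryStartThread is made up for illustration; on a correctly linked program it should simply return "ok".

```cpp
#include <string>
#include <system_error>
#include <thread>

// Hypothetical helper: starts and joins a trivial thread, and turns the
// std::system_error thrown on failure into a readable string.
std::string TryStartThread() {
  try {
    std::thread t([] {});
    t.join();
    return "ok";
  } catch (const std::system_error& e) {
    return std::string("std::system_error: ") + e.what();
  }
}
```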

    But, wait, why don't you need to do this when using the GNU linker? It uses the same rule: only a strong reference causes the library to be recorded as a dependency. The difference is that GNU ld also considers references from other shared libraries, while gold only considers references from regular object files. It turns out that the pthread library provides overriding definitions of several libc symbols, and there are strong references from libstdc++.so to some of those symbols (e.g., write). Those strong references are enough to get GNU ld to record libpthread.so as a dependency. This is more of an accident than design; I don't think changing gold to consider references from other shared libraries would actually be a robust fix. I think the proper solution is for GCC to put --no-as-needed in front of the -lpthread flag when you use -pthread.

    This raises the question of why this issue doesn't come up all the time when using POSIX threads and the gold linker. The answer is that this is a small test program; a larger program is almost certain to contain strong references to some of those libc symbols that libpthread.so overrides.

    Now let's look at the second failure mode, where both Notify() and Get() block indefinitely if you link libodr.so with g++, gold and -lpthread.

    In Notify(), you're holding the lock through the end of the function, including the call to cv.notify_one(). You really only need to hold the lock to set the ready flag; if we change it so that the lock is released before calling notify_one(), then the thread calling Get() will time out after 300 ms and does not block. So it's really the call to notify_one() that's blocking, and the program deadlocks because Get() is waiting on that same lock.
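
    The change to Notify() described above might look like this. It is only a sketch of the experiment; it narrows the lock scope and does not address the underlying linking problem.

```cpp
#include <condition_variable>
#include <mutex>

std::mutex mtx;
std::condition_variable cv;
bool ready = false;

void Notify() {
  {
    // Hold the lock only while modifying the shared flag.
    std::lock_guard<std::mutex> lock(mtx);
    ready = true;
  }  // mtx is released here
  // Notify without holding the mutex.
  cv.notify_one();
}
```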

    So why does it block only when __pthread_key_create is FUNC instead of NOTYPE? I think the type of the symbol is a red herring, and that the real problem is caused by the fact that gold doesn't record the symbol versions for references resolved by a library that isn't added as a needed library. The implementation of wait_for calls pthread_cond_timedwait, which has two versions in both libpthread and libc. It's possible that the loader is binding the reference to the wrong version, causing a deadlock by failing to unlock the mutex. I made a temporary patch to gold to record those versions, and that made the program work. Unfortunately, that's not a solution, as that patch can cause ld.so to crash under other circumstances.

    I tried changing cv.wait_for(...) to cv.wait(lock, []{ return ready; }), and the program runs perfectly in all scenarios, which further suggests that the problem is with pthread_cond_timedwait.
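
    For reference, the predicate-based variant could be written as below. The RunOnce driver is a hypothetical stand-in for main() in test.cpp; the rest keeps the names from odr.cpp. Because the predicate is checked under the lock before waiting, Get() returns whether or not Notify() has already run.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

std::mutex mtx;
std::condition_variable cv;
bool ready = false;

void Notify() {
  std::unique_lock<std::mutex> lock(mtx);
  ready = true;
  cv.notify_one();
}

void Get() {
  std::unique_lock<std::mutex> lock(mtx);
  // Untimed wait with a predicate: returns only once ready is true,
  // and returns immediately if it already is.
  cv.wait(lock, [] { return ready; });
}

// Hypothetical driver mirroring main() from test.cpp.
bool RunOnce() {
  std::thread thr(Notify);
  Get();
  thr.join();
  return ready;
}
```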

    The bottom line is that adding the --no-as-needed flag will fix the problem for this very small test case. Anything larger is likely to work without the extra flag, as you'll be increasing the odds of making a strong reference to a symbol in libpthread. (For example, adding a call to std::this_thread::sleep_for anywhere in odr.cpp adds a strong reference to nanosleep, which puts libpthread in the needed list.)
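
    As an illustration of the parenthetical above, this is the kind of call that would be added to odr.cpp. The helper name SleepAndMeasureMs is hypothetical, and whether the call ends up as a direct reference to nanosleep likely depends on the libstdc++ version, so treat the linking effect as an assumption to verify with readelf.

```cpp
#include <chrono>
#include <thread>

// Hypothetical helper: really sleeps for the given time (unlike the bare
// std::chrono::milliseconds(100) expression) and returns the elapsed
// wall time in milliseconds.
long long SleepAndMeasureMs(int ms) {
  const auto start = std::chrono::steady_clock::now();
  std::this_thread::sleep_for(std::chrono::milliseconds(ms));
  const auto elapsed = std::chrono::steady_clock::now() - start;
  return std::chrono::duration_cast<std::chrono::milliseconds>(elapsed).count();
}
```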

    Update: I've verified that the failing program is linking to the wrong version of pthread_cond_timedwait. In glibc 2.3.2, the pthread_cond_t type was changed, and the old versions of the APIs that use the type were changed to dynamically allocate a new (bigger) structure and store a pointer to it in the original type. So now, if the consuming thread reaches cv.wait_for before the producing thread reaches cv.notify_one, the implementation of cv.wait_for calls the old version of pthread_cond_timedwait, which initializes what it thinks is an old pthread_cond_t in cv with a pointer to a new pthread_cond_t. After that, when the other thread reaches cv.notify_one, its implementation assumes that cv contains a new-style pthread_cond_t rather than a pointer to one, so it calls pthread_mutex_lock with the pointer to the new pthread_cond_t instead of the pointer to the mutex. It locks that would-be mutex, which never gets unlocked, because the other thread unlocks the real mutex.