Currently I'm looking at a situation where an application gets stuck while two distinct instances of `std::unique_lock` in two separate threads simultaneously "own" a lock on the same `std::mutex`. Owning in this case means that `owns_lock()` returns `true` for both locks. I'm trying to make sense of this situation, which as far as I understand should not be possible. Upon inspection with gdb I found that while the `_M_owns` member is `true` for both locks, the `__lock`, `__count` and `__owner` members of the associated `std::mutex` are simultaneously zero. To give a clear picture I'll go through the situation with gdb step by step.
We have two threads, thread 1 and thread 33.
```
(gdb) info thread
  Id   Target Id                                    Frame
* 1    Thread 0x7ffff7e897c0 (LWP 66233) "tests"    __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x612000002f28) at ./nptl/futex-internal.c:57
  33   Thread 0x7fffe5ee0640 (LWP 66269) "tests"    __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x612000002f28) at ./nptl/futex-internal.c:57
```
Thread 1 and thread 33 each have their own instance `l` of `std::unique_lock` (at `0x7fffffffd240` and `0x7fffe5edf8c0` respectively) and access to the class members `vec` (`std::vector`, `0x612000002fb0`), `m` (`std::mutex`, `0x612000002ed8`) and `cv` (`std::condition_variable`, `0x612000002f00`), which are shared between both threads.
```
(gdb) select 5
(gdb) p &m
$14 = (std::mutex *) 0x612000002ed8
(gdb) p &cv
$15 = (std::condition_variable *) 0x612000002f00
(gdb) p &vec
$16 = (std::__debug::vector<std::shared_future<void>, std::allocator<std::shared_future<void> > > *) 0x612000002fb0
(gdb) p &l
$17 = (std::unique_lock<std::mutex> *) 0x7fffffffd240
```
```
(gdb) t 33
[Switching to thread 33 (Thread 0x7fffe5ee0640 (LWP 67477))]
#0  __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x612000002f28) at ./nptl/futex-internal.c:57
57      in ./nptl/futex-internal.c
(gdb) select 5
(gdb) p &m
$18 = (std::mutex *) 0x612000002ed8
(gdb) p &cv
$19 = (std::condition_variable *) 0x612000002f00
(gdb) p &vec
$20 = (std::__debug::vector<std::shared_future<void>, std::allocator<std::shared_future<void> > > *) 0x612000002fb0
(gdb) p &l
$21 = (std::unique_lock<std::mutex> *) 0x7fffe5edf8c0
```
Thread 1 is stuck on this line:

```
94          cv.wait(l);
```

and should proceed once it has acquired the lock. Thread 33 is stuck on this line:

```
54          cv.wait(l);
```

and should also proceed once it has acquired the lock. So let's take a look at who currently owns the lock, starting with thread 1.
```
(gdb) t 1
[Switching to thread 1 (Thread 0x7ffff7e897c0 (LWP 67445))]
#0  __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x612000002f28) at ./nptl/futex-internal.c:57
57      ./nptl/futex-internal.c: No such file or directory.
(gdb) select 5
(gdb) p l
$22 = {_M_device = 0x612000002ed8, _M_owns = true}
(gdb) p m
$23 = {<std::__mutex_base> = {_M_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 2, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
      __size = '\000' <repeats 12 times>, "\002", '\000' <repeats 26 times>, __align = 0}}, <No data fields>}
```
Note how `l._M_owns = true`, while the `std::mutex` `m`, despite showing `__nusers = 2`, does not appear to be locked.
Now looking at thread 33 we see the same situation:
```
(gdb) t 33
[Switching to thread 33 (Thread 0x7fffe5ee0640 (LWP 67477))]
#0  __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x612000002f28) at ./nptl/futex-internal.c:57
57      in ./nptl/futex-internal.c
(gdb) select 5
(gdb) p l
$24 = {_M_device = 0x612000002ed8, _M_owns = true}
(gdb) p m
$25 = {<std::__mutex_base> = {_M_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 2, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
      __size = '\000' <repeats 12 times>, "\002", '\000' <repeats 26 times>, __align = 0}}, <No data fields>}
```
Can anybody explain how both instances of `std::unique_lock` can simultaneously "own" the lock (i.e. `l.owns_lock() == true`) while the associated `std::mutex` in fact appears not to be locked at all?
I expect that the `_M_owns` member is simply not updated during `cv.wait`, because doing so would be a waste of time: the waiting thread is blocked for the entire period in which the mutex is released, so the program itself can never observe the flag change. The standard library isn't required to help you cheat with the debugger.

With that in mind, it seems probable that both threads are simply waiting on the condition variable, and the mutex is in fact not locked.