Tags: python, c, multithreading, gdb, futex

Release Python Thread Lock or Futex Using GDB


I would like to find a way to release a Python thread Lock using GDB on Linux. I am using Ubuntu 18.04, Python 3.6.9, and gdb 8.1.1. I am also willing to use the gdb package in Python.

This is for personal research and not intended for a production system.

Suppose I have this Python script named "m4.py", which produces a deadlock:

import threading
import time
import os

lock1 = threading.Lock()
lock2 = threading.Lock()

def func1(name):
    print('Thread',name,'before acquire lock1')
    with lock1:
        print('Thread',name,'acquired lock1')
        time.sleep(0.3)
        print('Thread',name,'before acquire lock2')
        with lock2:
            print('Thread',name,'DEADLOCK: This line will never run.')

def func2(name):
    print('Thread',name,'before acquire lock2')
    with lock2:
        print('Thread',name,'acquired lock2')
        time.sleep(0.3)
        print('Thread',name,'before acquire lock1')
        with lock1:
            print('Thread',name,'DEADLOCK: This line will never run.')

if __name__ == '__main__':
    print(os.getpid())

    thread1 = threading.Thread(target=func1, args=['thread1',])
    thread2 = threading.Thread(target=func2, args=['thread2',])
    thread1.start()
    thread2.start()

My goal is to use gdb to release either lock1 or lock2 or both, so that the "DEADLOCK: This line will never run" message is displayed.
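
For experimenting without hanging a process, the same lock-ordering inversion can be reproduced with bounded waits. This is a minimal sketch, not part of m4.py: the `worker` function, the two barriers, and the 0.2 second timeout are illustrative assumptions. The barriers only make the outcome deterministic by ensuring each thread holds its first lock before either one tries for the second.

```python
import threading

lock1 = threading.Lock()
lock2 = threading.Lock()
both_hold = threading.Barrier(2)    # passes once each thread holds its first lock
both_tried = threading.Barrier(2)   # passes once each thread has tried its second lock
results = {}

def worker(name, first, second):
    with first:
        both_hold.wait()
        # Bounded wait instead of blocking forever on the second lock.
        acquired = second.acquire(timeout=0.2)
        results[name] = acquired
        both_tried.wait()
        if acquired:
            second.release()

threads = [
    threading.Thread(target=worker, args=("thread1", lock1, lock2)),
    threading.Thread(target=worker, args=("thread2", lock2, lock1)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Neither thread can get its second lock while the other holds it.
print(results)
```

With a blocking `acquire()` in place of the timed one, both threads would wait indefinitely, which is the state m4.py ends up in.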

I think the first obstacle is that the program reaches the deadlock almost immediately, so there is no time to set a breakpoint in gdb. Is a breakpoint necessary?

Suppose I attach gdb by PID like this:

sudo gdb -p 121408

I can see that all threads are blocked on a futex.

(gdb) info threads
  Id   Target Id         Frame
* 1    Thread 0x7f56b324f740 (LWP 121408) "python3" 0x00007f56b2a377c6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x7f56ac000e70) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
  2    Thread 0x7f56b1b8d700 (LWP 121409) "python3" 0x00007f56b2a377c6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x1bc3fc0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
  3    Thread 0x7f56b138c700 (LWP 121410) "python3" 0x00007f56b2a377c6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x1bc3f90) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
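
When working with this output from a script, the futex address of each thread can be pulled out with a regular expression. The sketch below is only an illustration: `futex_words` is a hypothetical helper, and the sample text is the `info threads` output above with the trailing source location trimmed.

```python
import re

# Abridged copy of the "info threads" output shown above.
INFO_THREADS = """\
* 1    Thread 0x7f56b324f740 (LWP 121408) "python3" 0x00007f56b2a377c6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x7f56ac000e70)
  2    Thread 0x7f56b1b8d700 (LWP 121409) "python3" 0x00007f56b2a377c6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x1bc3fc0)
  3    Thread 0x7f56b138c700 (LWP 121410) "python3" 0x00007f56b2a377c6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x1bc3f90)
"""

# GDB thread id, then anything on the same line up to futex_word=<hex address>.
THREAD_LINE = re.compile(
    r"^\*?\s*(\d+)\s+Thread\s.*?futex_word=(0x[0-9a-f]+)", re.MULTILINE
)

def futex_words(text):
    """Map each GDB thread id to the futex address that thread is waiting on."""
    return {int(tid): int(addr, 16) for tid, addr in THREAD_LINE.findall(text)}

for tid, addr in sorted(futex_words(INFO_THREADS).items()):
    print(tid, hex(addr))
```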

The top frames of the backtrace show the C function calls that lead to the futex wait.

(gdb) thread 1
[Switching to thread 1 (Thread 0x7f56b324f740 (LWP 121408))]
#0  0x00007f56b2a377c6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x7f56ac000e70) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
205     in ../sysdeps/unix/sysv/linux/futex-internal.h
(gdb) bt
#0  0x00007f56b2a377c6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x7f56ac000e70) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1  do_futex_wait (sem=sem@entry=0x7f56ac000e70, abstime=0x0) at sem_waitcommon.c:111
#2  0x00007f56b2a378b8 in __new_sem_wait_slow (sem=0x7f56ac000e70, abstime=0x0) at sem_waitcommon.c:181
#3  0x00000000005aac15 in PyThread_acquire_lock_timed () at ../Python/thread_pthread.h:386
#4  0x00000000004d0ade in acquire_timed (timeout=<optimized out>, lock=0x7f56ac000e70) at ../Modules/_threadmodule.c:68
#5  lock_PyThread_acquire_lock () at ../Modules/_threadmodule.c:151
#6  0x000000000050a335 in _PyCFunction_FastCallDict (kwargs=<optimized out>, nargs=<optimized out>, args=<optimized out>, func_obj=<built-in method acquire of _thread.lock object at remote 0x7f56b1c289e0>)
    at ../Objects/methodobject.c:231
#7  _PyCFunction_FastCallKeywords (kwnames=<optimized out>, nargs=<optimized out>, stack=<optimized out>, func=<optimized out>) at ../Objects/methodobject.c:294
#8  call_function.lto_priv () at ../Python/ceval.c:4851


Here are some of the things I have tried:

Return

The GDB documentation says: "When you use return, GDB discards the selected stack frame (and all frames within it)."

(gdb) return
Can not force return from an inlined function.

Access Python release function.

In this example, frame 7 is the last frame where py-locals works. I tried accessing the release() method of the Lock. As far as I know, it is not possible to invoke a Python object's method this way from GDB.

(gdb) frame 7
#7  _PyCFunction_FastCallKeywords (kwnames=<optimized out>, nargs=<optimized out>, stack=<optimized out>, func=<optimized out>) at ../Objects/methodobject.c:294
294     in ../Objects/methodobject.c
(gdb) print lock
$7 = 0
(gdb) print lock.release
Attempt to extract a component of a value that is not a structure.

Interpret Lock as PyThread_type_lock

I am not sure that interpreting the object as an opaque pointer is useful.

(gdb) print *((PyThread_type_lock *) 0x7f56ac000e70)
$8 = (PyThread_type_lock) 0x100000000
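
One possible reading of 0x100000000: the backtrace goes through glibc's semaphore code, and on 64-bit builds glibc is understood to pack a semaphore into one 64-bit word, with the value in the low 32 bits and the waiter count in the high 32 bits. That layout is an assumption about the glibc in use, not something the GDB output states. Under that assumption, the word decodes as follows (`decode_sem_word` is a hypothetical helper):

```python
# Assumed glibc layout for a 64-bit semaphore word:
# low 32 bits = semaphore value, high 32 bits = number of waiters.
SEM_VALUE_MASK = 0xFFFFFFFF
SEM_NWAITERS_SHIFT = 32

def decode_sem_word(raw):
    """Split a raw 64-bit semaphore word into (value, nwaiters)."""
    return raw & SEM_VALUE_MASK, raw >> SEM_NWAITERS_SHIFT

value, nwaiters = decode_sem_word(0x100000000)
print(value, nwaiters)
```

Read this way, the value is 0 (the lock is held) and one thread is waiting, which would match the thread blocked in the futex wait at that address.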

Call void PyThread_release_lock(PyThread_type_lock);

This attempt produces a segmentation fault.

(gdb) print (void)PyThread_release_lock (lock)

Thread 1 "python3" received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
The program being debugged was signaled while in a function called from GDB.
GDB remains in the frame where the signal was received.
To change this behavior use "set unwindonsignal on".
Evaluation of the expression containing the function
(PyThread_release_lock) will be abandoned.
When the function is done executing, GDB will silently stop.

Make System Call

I reran the script because the SIGSEGV killed it. I then adapted code from this Gist to make a syscall using the ctypes library in a Python script. In part, the code is this:

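import ctypes


class timespec(ctypes.Structure):
    # Assumed definition (not shown in the excerpt): struct timespec on 64-bit Linux.
    _fields_ = [('tv_sec', ctypes.c_long), ('tv_nsec', ctypes.c_long)]


def _is_ctypes_obj(obj):
    # Assumed helper (not shown in the excerpt): ctypes data objects expose these attributes.
    return hasattr(obj, '_b_base_') and hasattr(obj, '_objects')


def _is_ctypes_obj_pointer(obj):
    return hasattr(obj, '_type_') and hasattr(obj, 'contents')


def _coerce_to_pointer(obj):
    print("obj", obj)
    if obj is None:
        return None

    if _is_ctypes_obj(obj):
        if _is_ctypes_obj_pointer(obj):
            return obj
        return ctypes.pointer(obj)

    return (obj[0].__class__ * len(obj))(*obj)


def _get_futex_syscall():
    futex_syscall = ctypes.CDLL(None, use_errno=True).syscall
    futex_syscall.argtypes = (ctypes.c_long, ctypes.c_void_p, ctypes.c_int,
                              ctypes.c_int, ctypes.POINTER(timespec),
                              ctypes.c_void_p, ctypes.c_int)
    futex_syscall.restype = ctypes.c_int
    futex_syscall_nr = ctypes.c_long(202)  # futex on x86-64

    # pylint: disable=too-many-arguments
    def _futex_syscall(uaddr, futex_op, val, timeout, uaddr2, val3):
        # Caveat: this wraps the number in a new c_int, so the kernel receives
        # the address of that local object in this script's own process, not
        # the futex word of the deadlocked process.
        uaddr = ctypes.c_int(uaddr)
        error = futex_syscall(
            futex_syscall_nr,
            _coerce_to_pointer(uaddr),
            ctypes.c_int(futex_op),
            ctypes.c_int(val),
            _coerce_to_pointer(timeout or timespec()),
            _coerce_to_pointer(ctypes.c_int(uaddr2)),
            ctypes.c_int(val3)
        )
        # Return (result, errno) so the caller can inspect the outcome.
        return error, (ctypes.get_errno() if error == -1 else 0)

    # futex_op 1 is FUTEX_WAKE; 99 is the maximum number of waiters to wake.
    res = _futex_syscall(0x7f5ca8000e70, 1, 99, 0, 0, 0)
    print(res)


if __name__ == '__main__':
    _get_futex_syscall()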

I do not know whether it is possible to unlock a futex with GDB. If it is, I would like to understand how.


Solution

  • In a subsequent run, the following procedure worked.

    Start gdb

    GDB attaches to the process and pauses it.

    sudo gdb -p 95348
    

    Set a Catchpoint and Continue Execution

    A catchpoint is a breakpoint that stops the program whenever the specified system call is made. On x86-64, the futex system call is number 202. See Set Catchpoint and Syscalls.

    (gdb) catch syscall 202
    Catchpoint 1 (syscall 'futex' [202])
    (gdb) continue
    Continuing.
    

    View Threads

    Both child threads are blocked at a futex.

    (gdb) info threads
      Id   Target Id         Frame
      1    Thread 0x7f7e797e0740 (LWP 95348) "python3" 0x00007f7e78fc87c6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x7f7e70000e70) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
    * 2    Thread 0x7f7e7811e700 (LWP 95349) "python3" 0x00007f7e78fc87c6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x1621fc0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
      3    Thread 0x7f7e7791d700 (LWP 95350) "python3" 0x00007f7e78fc87c6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x1621f90) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
    

    Access One Thread

    (gdb) thread 2
    [Switching to thread 2 (Thread 0x7f7e7811e700 (LWP 95349))]
    #0  0x00007f7e78fc87c6 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x1621fc0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
    

    Get, Set, and Get the Value at the Address of the Futex

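    The futex word here appears to be the value of the semaphore behind the Python lock (the backtrace in the question passes through sem_waitcommon.c): 0 is locked, and 1 is unlocked. Writing the word does not wake the waiter by itself; presumably the thread sees the new value when its interrupted futex call is retried after continue. For information about assignment in GDB, see Assignment.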

    (gdb) print futex_word
    $1 = (unsigned int *) 0x1621fc0
    (gdb) print *(unsigned int *) 0x1621fc0
    $2 = 0
    (gdb) set var *(unsigned int *) 0x1621fc0 = 1
    (gdb) print *(unsigned int *) 0x1621fc0
    $3 = 1
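
Since the question mentions GDB's Python package, the print, set, print sequence above could also be driven from a script. This is a rough sketch under assumptions: `futex_unlock_commands` and `unlock_in_gdb` are hypothetical names, the address must be read from your own session, and `gdb.execute` is only available inside GDB's embedded interpreter, so the `gdb` import is deferred to the function that needs it.

```python
def futex_unlock_commands(futex_addr, unlocked_value=1):
    """Build the GDB commands used above to inspect and overwrite a futex word."""
    target = "*(unsigned int *) {:#x}".format(futex_addr)
    return [
        "print " + target,                                # value before
        "set var {} = {}".format(target, unlocked_value),  # overwrite the word
        "print " + target,                                # value after
    ]

def unlock_in_gdb(futex_addr):
    """Run the commands in the attached process; works only inside GDB."""
    import gdb  # provided by GDB's embedded Python, not installable via pip
    for command in futex_unlock_commands(futex_addr):
        gdb.execute(command)

if __name__ == "__main__":
    # Outside GDB, just show what would be executed for the address above.
    for command in futex_unlock_commands(0x1621fc0):
        print(command)
```

Inside a GDB session, one way to use it would be to save it under a name of your choice (say unlock.py), load it with `source unlock.py`, and then run `python unlock_in_gdb(0x1621fc0)` with the thread stopped as shown above.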
    

    Remove Catchpoint and Continue

    (gdb) info breakpoints
    Num     Type           Disp Enb Address            What
    1       catchpoint     keep y                      syscall "futex"
            catchpoint already hit 37 times
    (gdb) delete 1
    (gdb) continue
    Continuing.
    

    Application Prints Messages with Exceptions

    Thread thread1 before acquire lock1
    Thread thread1 acquired lock1
    Thread thread2 before acquire lock2
    Thread thread2 acquired lock2
    Thread thread1 before acquire lock2
    Thread thread2 before acquire lock1
    Thread thread1 DEADLOCK: This line will never run.
    Thread thread2 DEADLOCK: This line will never run.
    Exception in thread Thread-2:
    Traceback (most recent call last):
      File "m5.py", line 25, in func2
        print('Thread',name,'DEADLOCK: This line will never run.')
    RuntimeError: release unlocked lock
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
        self.run()
      File "/usr/lib/python3.6/threading.py", line 864, in run
        self._target(*self._args, **self._kwargs)
      File "m5.py", line 25, in func2
        print('Thread',name,'DEADLOCK: This line will never run.')
    RuntimeError: release unlocked lock
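
A plausible explanation for the RuntimeError is that forcing the futex word to 1 lets a second thread acquire a lock whose original holder never released it, so the same lock ends up being released twice. That is an inference from the traceback, not something verified against the CPython source. The error itself is easy to reproduce in isolation:

```python
import threading

lock = threading.Lock()
lock.acquire()
lock.release()          # first release: the lock is now free

try:
    lock.release()      # second release of an already free lock
    error = None
except RuntimeError as exc:
    error = exc

print(type(error).__name__, error)
```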
    
    

    Threads Exit

    [Thread 0x7f7e7791d700 (LWP 95350) exited]
    [Thread 0x7f7e7811e700 (LWP 95349) exited]
    [Inferior 1 (process 95348) exited normally]