python pdb

Is it possible to change the return value of a function with pdb?


Let's say I have the following code:

def returns_false():
  breakpoint()
  return False

assert(returns_false())
print("Hello world")

Is there a sequence of pdb commands that will print "Hello world" without first triggering an AssertionError?

I can't modify a single character of this source file; I'm only looking for what I can achieve while it's already running.

What I tried:

But none of that changes the actual return value.


Solution

  • This is actually possible in a dynamic language such as Python, where a function's compiled code and its constants are ordinary objects that can be inspected and modified while the program is running.

    You'll have a lot of typing to do, but here's what you can type into the debugger:

    import ctypes
    tup = returns_false.__code__.co_consts
    obj = ctypes.py_object(tup)
    pos = ctypes.c_ssize_t(1)
    o = ctypes.py_object(True)
    ref_count = ctypes.c_long.from_address(id(tup))
    original_count = ref_count.value
    ref_count.value = 1
    ctypes.pythonapi.Py_IncRef(o)
    ctypes.pythonapi.PyTuple_SetItem(obj, pos, o)
    ref_count.value = original_count
    c
    

    This overwrites, in memory, the constant that the function returns, so returns_false will return True instead of False. The assertion that runs after you exit pdb will then pass. There is no sleight of hand here and no modification of the original source file: we're literally changing the return value at runtime.

    The last line, "c", is the pdb shortcut for "continue"; it resumes execution and exits the debugger.

    Explanation

    I'm using Python 3.12.4 here. Some implementation details may differ in other Python versions, but the same basic technique should work.

    For the original source code:

    def returns_false():
      breakpoint()
      return False
    
    assert(returns_false())
    print("Hello world")
    

    Consider the disassembly of your function, produced with the stdlib dis module:

    >>> import dis
    >>> dis.dis(returns_false)
      1           0 RESUME                   0
    
      2           2 LOAD_GLOBAL              1 (NULL + breakpoint)
                 12 CALL                     0
                 20 POP_TOP
    
      3          22 RETURN_CONST             1 (False)
    

    The last line shows the return instruction, RETURN_CONST, with (False). You'll also see three numbers on the last line of the disassembly:

      • 3 is the source line number
      • 22 is the bytecode offset of the instruction
      • 1 is the argument of the instruction (the oparg)

    This last bullet point is interesting. It means the function is returning item 1 from the consts table of the function's code object. dis has helpfully indicated in parentheses that this item is "False", but it's just rendering whatever item 1 in the consts table is:

    >>> returns_false.__code__.co_consts
    (None, False)
    

    The consts table will be longer if you have more constants in your function body. For example, if you added the line x = 1234 inside the function, you'd expect to see 1234 in the consts table, and the return value would now be found at index 2 instead (the pos in my example would have to be changed accordingly).
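
    Since the index depends on what else is in the function body, it can be looked up rather than hard-coded. The following is a minimal sketch: const_index and returns_false_with_extra_const are hypothetical names introduced only for illustration, and the exact layout of the consts tuple can vary between Python versions, so no particular output is claimed.

```python
def const_index(func, value):
    """Return the index of `value` in func's consts table, matched by identity."""
    # Identity (`is`) matters here: `0 == False` is true in Python, so an
    # equality search such as tuple.index(False) could match the wrong slot.
    for i, const in enumerate(func.__code__.co_consts):
        if const is value:
            return i
    raise LookupError(f"{value!r} is not in the consts table")


def returns_false_with_extra_const():  # hypothetical variant of returns_false
    x = 1234  # adds another entry to the consts table
    return False


consts = returns_false_with_extra_const.__code__.co_consts
pos = const_index(returns_false_with_extra_const, False)
print(consts, pos)
```

    The value found this way is what you would pass as pos in the patch.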

    So the consts table is a tuple, which is an immutable type. But what if we could modify that tuple anyway? (Everything in Python is mutable, if you know where to look.) Would changing the item at index 1 change the return value of the function? Indeed, it would.

    Part of the C API is PyTuple_SetItem:

    int PyTuple_SetItem(PyObject *p, Py_ssize_t pos, PyObject *o)

    Insert a reference to object o at position pos of the tuple pointed to by p. Return 0 on success. If pos is out of bounds, return -1 and set an IndexError exception.

    And the CPython devs are even so helpful as to provide a Python API, ctypes.pythonapi, for calling functions such as PyTuple_SetItem directly from within the runtime.
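
    To get a feel for ctypes.pythonapi before doing anything destructive, here is a small read-only sketch that calls PyTuple_Size, another function from the same part of the C API. The name tuple_size is just a local alias chosen for this example.

```python
import ctypes

# C signature: Py_ssize_t PyTuple_Size(PyObject *p)
tuple_size = ctypes.pythonapi.PyTuple_Size
tuple_size.argtypes = [ctypes.py_object]  # pass the Python object itself
tuple_size.restype = ctypes.c_ssize_t

n = tuple_size((None, False))
print(n)  # 2
```

    Declaring argtypes and restype is optional when you pass ctypes instances explicitly, as the patch in this answer does, but it makes the call less error-prone.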

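    The rest of the answer is a matter of technique and some know-how about the implementation, taking care not to cause a segfault or screw up reference counting. PyTuple_SetItem refuses to modify a tuple whose reference count is not exactly 1, which is why the count is temporarily forced to 1 and restored afterwards. It also steals a reference to the new item, which is why Py_IncRef is called on it first.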
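
    The same steps can be wrapped in a function with the reasoning spelled out in comments. This is only a sketch: patch_const and always_false are hypothetical names, it uses c_ssize_t rather than c_long for the reference count, and it assumes a standard CPython build in which the reference count is the first field of the object (which may not hold for debug or free-threaded builds).

```python
import ctypes


def patch_const(func, pos, new_value):
    """Overwrite func.__code__.co_consts[pos] in place (CPython-only hack)."""
    tup = func.__code__.co_consts
    # Build every ctypes wrapper *before* touching the refcount, because
    # py_object(tup) itself takes a new reference to the tuple.
    obj = ctypes.py_object(tup)
    index = ctypes.c_ssize_t(pos)
    new_ref = ctypes.py_object(new_value)
    # Assumption: id(tup) is the address of the tuple's reference count.
    ref_count = ctypes.c_ssize_t.from_address(id(tup))
    original_count = ref_count.value
    # PyTuple_SetItem only accepts a tuple whose refcount is exactly 1.
    ref_count.value = 1
    try:
        # PyTuple_SetItem steals a reference to the new item, so add one.
        ctypes.pythonapi.Py_IncRef(new_ref)
        ctypes.pythonapi.PyTuple_SetItem(obj, index, new_ref)
    finally:
        # Put the real refcount back so the tuple is not freed early.
        ref_count.value = original_count


def always_false():  # hypothetical stand-in for returns_false, no breakpoint()
    return False


# Locate the slot holding False by identity instead of hard-coding it.
pos = next(
    i for i, c in enumerate(always_false.__code__.co_consts) if c is False
)
before = always_false()
patch_const(always_false, pos, True)
after = always_false()
print(before, after)
```

    In the debugger you would still type the statements one by one as shown at the top of this answer; the function form is mainly useful for reading the steps in order.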

    Note that the disassembly of the "hacked" function will respect updates to the consts table, and will now render RETURN_CONST with (True):

    >>> returns_false.__code__.co_consts
    (None, False)
    >>> import ctypes
    ... tup = returns_false.__code__.co_consts
    ... obj = ctypes.py_object(tup)
    ... pos = ctypes.c_ssize_t(1)
    ... o = ctypes.py_object(True)
    ... ref_count = ctypes.c_long.from_address(id(tup))
    ... original_count = ref_count.value
    ... ref_count.value = 1
    ... ctypes.pythonapi.Py_IncRef(o)
    ... ctypes.pythonapi.PyTuple_SetItem(obj, pos, o)
    ... ref_count.value = original_count
    ... 
    >>> returns_false.__code__.co_consts
    (None, True)
    >>> returns_false()
    > /private/tmp/p.py(3)returns_false()
    -> return False
    (Pdb) c
    True
    >>> import dis
    >>> dis.dis(returns_false)
      1           0 RESUME                   0
    
      2           2 LOAD_GLOBAL              1 (NULL + breakpoint)
                 12 CALL                     0
                 20 POP_TOP
    
      3          22 RETURN_CONST             1 (True)
    

    Finally, I'll mention that RETURN_CONST is not the only return opcode for a function; in fact, it is new in Python 3.12. RETURN_VALUE is likely more common. In other Python 3.x versions (I've checked 3.6-3.11) your code will use the RETURN_VALUE opcode, but the exact same patch typed verbatim in the debugger will work, because the function will just be returning a value which was previously loaded from the consts table onto the stack. The disassembly details will look different on Python 3.6-3.10, 3.11 and 3.12.

    You may also be interested in the Q&A "it is possible to monkeypatch a local variable introduced in a function body?", where I have demonstrated a similar technique for mutating local variables of a function. In that answer I've also changed the return value of a function which was using a RETURN_VALUE opcode.
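
    If you want to see which return opcode your own interpreter uses, a short sketch like the following will list it. returns_false_plain is a hypothetical stand-in for the function in the question without the breakpoint() call, and the output depends on the Python version you run it with.

```python
import dis
import sys


def returns_false_plain():  # hypothetical stand-in, no breakpoint()
    return False


# Collect the opcode names and keep only the return instructions.
ops = [ins.opname for ins in dis.get_instructions(returns_false_plain)]
return_ops = [name for name in ops if name.startswith("RETURN")]
print(sys.version_info[:2], return_ops)
```

    Either way, the returned value comes from the consts table, which is why the same patch applies.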