Tags: python, reverse-engineering, bytecode, cpython, disassembly

In CPython, accessing the bytecode evaluation stack


Given a CPython frame pointer, how do I look at arbitrary evaluation stack entries? (Some specific stack entries can be found via locals(); I'm talking about the other stack entries.)

I asked a broader question like this a while ago:

getting the C python exec argument string or accessing the evaluation stack

but here I want to focus on being able to read CPython stack entries at runtime.

I'll take a solution that works on CPython 2.7 or on any Python later than 3.3. However, if you have something that works outside of that range, share it, and if there is no better solution I'll accept it.

I'd prefer not to modify the CPython source. In Ruby, I have in fact done this to get what I want, and I can say from experience that it is probably not the way we want to work. But again, if there's no better solution, I'll take that. (My understanding with respect to SO points is that I lose them to the bounty either way, so I'm happy to see them go to the person who has shown the most good spirit and willingness to look at this, assuming it works.)

Update: See the comment by user2357112. TL;DR: this is hard to impossible to do. (Still, if you think you have the gumption to try, by all means do so.)

So instead, let me narrow the scope to this simpler problem which I think is doable:

Given a Python stack frame, such as the one returned by inspect.currentframe(), find the beginning of the evaluation stack. In the C version of the structure, this is f_valuestack. From there we then need a way, in Python, to read off the Python values/objects.
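To make the gap concrete, here is a minimal sketch of what a frame object does expose at the Python level. It is not a solution: it only reports the code object's co_stacksize (the maximum evaluation stack depth the compiler computed), f_lasti, and the local names, and it checks that there is no Python-visible f_valuestack attribute. The helper name describe_frame and the function sample are made up for illustration.

```python
import inspect


def describe_frame(frame):
    """Collect the stack-related facts a frame exposes to Python code.

    describe_frame is a hypothetical helper, not part of any library.
    """
    code = frame.f_code
    return {
        "name": code.co_name,
        # Upper bound on the number of evaluation stack entries for this code.
        "max_stack_depth": code.co_stacksize,
        # Offset of the last bytecode instruction executed in this frame.
        "last_instruction_offset": frame.f_lasti,
        # Locals are reachable; other stack entries are not.
        "local_names": sorted(frame.f_locals),
        # The C-level f_valuestack pointer has no Python-level attribute.
        "exposes_valuestack": hasattr(frame, "f_valuestack"),
    }


def sample(a, b):
    total = a + b
    return describe_frame(inspect.currentframe())


info = sample(1, 2)
print(info)
```

So Python code can learn how deep the stack may get and where execution is, but the entries themselves still have to be read from the C side.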

Update 2: Well, the bounty period is over and no one (including my own summary answer) has offered concrete code. I feel this is a good start, though, and I now understand the situation much better than I did. In the obligatory "describe why you think there should be a bounty" field I picked the option "to draw more attention to this problem". The prior incarnation of this question had fewer than a dozen views; as I type this, this one has been viewed a little under 190 times. So in that respect it is a success. However...

If someone in the future decides to carry this further, contact me and I'll set up another bounty.

Thanks all.


Solution

  • I think this is possible now because CPython introduced a frame evaluation API in PEP 523. PyTorch's Dynamo compiler uses this API to rewrite code objects with torch-specific operators.

    You can use CPython's API to install a hook and observe each frame right before the VM executes it. You can suspend Python's main thread by acquiring and releasing the GIL. While the interpreter's frame itself is immutable, you can rewrite its contents (the execution unit, i.e. the code object). Also make sure to handle the reference counts of objects owned by the Python runtime, to avoid leaks or crashes. The example below targets CPython 3.12; the frame layout in internal/pycore_frame.h changes between versions.

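    #include <Python.h>
    #include <stdio.h>
    /* Internal header: the frame layout is version-specific (this matches 3.12).
       If the compiler reports that it requires Py_BUILD_CORE, define
       Py_BUILD_CORE_MODULE before including Python.h. */
    #include "internal/pycore_frame.h"
    
    /* Called for every frame just before the interpreter evaluates it. */
    static PyObject* my_frame_eval(PyThreadState* tstate, struct _PyInterpreterFrame* frame, int throwflag) {
        PyCodeObject* code = frame->f_code;
        if (code) {
            const char* name = PyUnicode_AsUTF8(code->co_name);
            if (name) {
                printf("Executing frame: %s\n", name);
            } else {
                PyErr_Clear();  /* could not encode the name; skip the message */
            }
        }
    
        /* Hand the frame to the default evaluator. */
        return _PyEval_EvalFrameDefault(tstate, frame, throwflag);
    }
    
    /* evalstack.install_hook(): route frame evaluation through my_frame_eval. */
    static PyObject* install_hook_py(PyObject* self, PyObject* args) {
        PyThreadState* tstate = PyThreadState_Get();
        _PyInterpreterState_SetEvalFrameFunc(tstate->interp, my_frame_eval);
        Py_RETURN_NONE;
    }
    
    static PyMethodDef _methods[] = {
        {"install_hook", install_hook_py, METH_NOARGS, NULL},
        {NULL, NULL, 0, NULL}  /* sentinel: the method table must be NULL-terminated */
    };
    
    static struct PyModuleDef evalstack = {
        PyModuleDef_HEAD_INIT,
        "evalstack",
        NULL,
        -1,
        _methods
    };
    
    PyMODINIT_FUNC PyInit_evalstack(void) {
        return PyModule_Create(&evalstack);
    }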
    

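    On macOS, compile against the headers of the specific Python version and link with that version's framework library:

    clang -shared -o evalstack.so evalstack.c $(python3.12-config --includes) /opt/homebrew/opt/python@3.12/Frameworks/Python.framework/Versions/3.12/Python

    Then install the hook from Python; each frame evaluated afterwards is reported before it runs:
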
    import evalstack
    evalstack.install_hook()
    
    def foo(): print("hello from python world")
    foo()