pythonc++pybind11python-embedding

How to retrieve the line number where a C++ function is called from python using pybind11


I'm trying to make a C++ logger class for embedded python script with pybind11. How can I retrieve the line number where a C++ function is called from python?

I have something like this in C++:

class PythonLogger
{
public:
    PythonLogger(const std::string& filename)  { /* opens log file */ }
    ~PythonLogger() { /* closes log file */ }
    void log(const std::string& msg) {
        // writes formated log message to log file:
        // [<current time>] [<script file name>:<line number>]: msg
        // "line number" here should be the line number where 
        // this function is called inside python script
    }
private:
    <some file class> log_file;
};
typedef std::shared_ptr<PythonLogger> PythonLoggerPtr;

PYBIND11_EMBEDDED_MODULE(PythonLogger, m) {
    py::class_<PythonLogger, PythonLoggerPtr>(m, "Logger")
        .def("debug", &PythonLogger::debug);
}

int main()
{
    py::scoped_interpreter guard{};
    PythonLoggerPtr logger = std::make_shared<PythonLogger>("python_log.log");
    try
    {
        auto script = py::module_::import("script");
        script.import("PythonLogger");
        script.attr("logger") = logger;
        auto func = script.attr("func");
        func();
    }
    catch (const std::exception& e)
    {
        std::print("{}\n", e.what());
    }
}

Please ignore that I didn't actually include any headers in this code.

In script.py:

def func():
    logger.debug("debug message")

And if I run this code, it should write this to the log file:

[<current time>] [script.py:2]: debug message

Solution

  • One possibility is to inspect the Python call stack from the C++ function and grab the info about the caller from there. This approach might involve a noticeable overhead -- I haven't measured it, but it would be a good idea before you use this in production.

    You could do this using the standard inspect module, for example by calling inspect.stack().

    py::module inspect_mod = py::module::import("inspect");
    py::list frames = inspect_mod.attr("stack")();
    

    This function returns a list of frame information objects, and we're interested in the first one.

    py::object calling_frame = frames[0];
    

    Now, we want to grab attributes filename (a string) and lineno (an integer).

    py::str filename_py = calling_frame.attr("filename");
    py::int_ line_no_py = calling_frame.attr("lineno");
    

    Next, cast them into C++ types.

    auto const filename = filename_py.cast<std::string>();
    auto const line_no = line_no_py.cast<uint32_t>();
    

    And now you can generate your desired log message.


    Example code:

    #include <chrono>
    #include <cstdint>
    #include <iomanip>
    #include <iostream>
    #include <string>
    
    #include <pybind11/pybind11.h>
    #include <pybind11/embed.h>
    
    namespace py = pybind11;
    
    
    PYBIND11_EMBEDDED_MODULE(testmodule, m)
    {
        m.def("test_log", [](py::str message) {
            py::module inspect_mod = py::module::import("inspect");
            py::list frames = inspect_mod.attr("stack")();
            py::object calling_frame = frames[0];
            py::str filename_py = calling_frame.attr("filename");
            py::int_ line_no_py = calling_frame.attr("lineno");
            auto const filename = filename_py.cast<std::string>();
            auto const line_no = line_no_py.cast<uint32_t>();
    
            using std::chrono::system_clock;
            auto const timestamp = system_clock::to_time_t(system_clock::now());
    
            std::cout << "["
                << std::put_time(std::localtime(&timestamp), "%FT%T%z")
                << "] [" << filename << ":" << line_no << "]: "
                << message.cast<std::string>() << "\n";
        });
    }
    
    int main()
    {
        py::scoped_interpreter guard{};
    
        try {
            py::exec(R"(\
    import testmodule
    import test_script
    
    test_script.foo()
    testmodule.test_log("From embedded code fragment.")
    )");
        } catch (py::error_already_set& e) {
            std::cerr << e.what() << "\n";
        }
    
    }
    

    Python script test_script.py used by the above example:

    import testmodule
    
    def foo():
        testmodule.test_log("On line 4 in foo().")
    
    testmodule.test_log("On line 6.")
    

    Example output:

    g:\example>so07.exe
    [2023-12-11T18:31:39+0100] [g:\example\test_script.py:6]: On line 6.
    [2023-12-11T18:31:39+0100] [g:\example\test_script.py:4]: On line 4 in foo().
    [2023-12-11T18:31:39+0100] [<string>:7]: From embedded code fragment.
    

    Notes

    One improvement would be to cache the inspect.stack function in persistent logger object, so you don't need to fetch it for every message.

    Another would be to rewrite it to directly use Python C API to extract the relevant frame info without round-trip to the inspect implementation. One some further inspection, this might be a moving target relying on implementation details that can (and do) change over time. Some avenues for research nevertheless:


    Alternate Approach

    After reading through the code of the inspect module, I've arrived at following approach using sys._getframe and the frame object directly:

    py::module sys_mod = py::module::import("sys");
    py::object calling_frame = sys_mod.attr("_getframe")(0);
    py::str filename_py = calling_frame.attr("f_code").attr("co_filename");
    py::int_ line_no_py = calling_frame.attr("f_lineno");
    

    The rest would be the same.

    Cache the result of sys_mod.attr("_getframe") to avoid fetching it every time.

    py::object getframe_fn = py::module::import("sys").attr("_getframe");
    
    // ... sometime later ...
    py::object calling_frame = getframe_fn(0);
    

    However, if you do cache that function, you will probably have to make sure the cached object's lifetime doesn't exceed the lifetime of the interpreter. I'll leave it up to the reader to figure that out and handle.