Tags: python, memory, shared-ptr, pybind11

Why is the output of id() applied to a Python object returned from C++ via pybind11 unstable?


I am using a holder class to store a std::shared_ptr to an instance of another C++ class, where that instance is created in Python.

pointers.cpp:

#include <memory>
#include <iostream>

#include <pybind11/pybind11.h>

class Child {

};

class Holder {
public:
    Holder(std::shared_ptr<Child> child) : m_child {child} {
        print_child_address();
    }

    std::shared_ptr<Child> get_child() {
        return m_child;
    }

    void print_child_address() {
        std::cout << m_child << std::endl;
    }

private:
    std::shared_ptr<Child> m_child;
};

namespace py = pybind11;

PYBIND11_MODULE(pointers, m) {

    py::class_<Child, std::shared_ptr<Child>>(m, "Child")
    .def(py::init<>());

    py::class_<Holder>(m, "Holder")
    .def(py::init<const std::shared_ptr<Child>&>(), py::arg("child"))
    .def("get_child", &Holder::get_child)
    .def("print_child_address", &Holder::print_child_address);
}

CMakeLists.txt:

cmake_minimum_required(VERSION 3.21)
project(pointers VERSION 1.0 LANGUAGES CXX)

list(APPEND CMAKE_PREFIX_PATH "${CMAKE_BINARY_DIR}")
find_package(pybind11 REQUIRED)

pybind11_add_module(pointers pointers.cpp)

This works from Python:

>>> from pointers import *
>>> c = Child()
>>> h = Holder(c)
0x607d93adc800
>>> hex(id(c))
'0x7c9f96460a70'
>>> hex(id(h.get_child()))
'0x7c9f96460a70'

However, when I call it like this, I get a very strange result:

>>> h = Holder(Child())
0x6304fb8a8800
>>> hex(id(h.get_child()))
'0x7dd87bcc4eb0'
>>> hex(id(h.get_child()))
'0x7dd87bcc4c70'
>>> a = h.get_child()
>>> hex(id(h.get_child()))
'0x7dd87bcc4eb0'
>>> hex(id(h.get_child()))
'0x7dd87bcc4eb0'

The id (address) of h.get_child() changes on every call until I assign h.get_child() to a variable, at which point it becomes stable. In IPython the behaviour is even stranger: the address becomes stable after calling h.get_child() on its own, without an assignment.

(Coincidentally, in this run the final stable address is the same one I saw at the beginning, but this is not normally the case.)

The std::shared_ptr inside the Holder class is stable and everything seems to work on the C++ side, but I don't like this apparent instability of the Python object. Can anyone explain what is happening here?


Solution

  • After h = Holder(Child()) runs, the temporary Python Child wrapper's reference count drops to 0 and the wrapper is destroyed. This does not affect the C++ Child object, which the std::shared_ptr stored in Holder keeps alive.

    Every time you call id(h.get_child()), pybind11 creates a new Python wrapper for that Child. Because the wrapper is not assigned to anything, its reference count drops to 0 again as soon as id() returns, and it is destroyed. The next wrapper may be allocated at the same address or at a different one, which is why the id appears to jump around. If you write a = h.get_child(), the wrapper now has a reference and will not be destroyed (unless you do del a, after which the id will probably change again).

    As long as something references the Python object, Python will not destroy it. An interactive shell can hold such a reference for you: IPython stores the result of a bare h.get_child() expression in its output history (_ and Out), which is why id() becomes stable there without an explicit assignment.

    If the Python object already exists, pybind11 returns that same object, as documented here: https://pybind11.readthedocs.io/en/stable/advanced/functions.html#return-value-policies

    One important aspect of the above policies is that they only apply to instances which pybind11 has not seen before, in which case the policy clarifies essential questions about the return value’s lifetime and ownership. When pybind11 knows the instance already (as identified by its type and address in memory), it will return the existing Python object wrapper rather than creating a new copy.
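The behaviour described above can be imitated in pure Python, without building the extension. The following is only a rough analogy, not pybind11's actual implementation: ChildWrapper, the _live_wrappers registry and the cpp_address value are hypothetical names invented for this sketch, and it assumes CPython, where an object is destroyed as soon as its reference count reaches 0.

```python
import weakref


class ChildWrapper:
    """Hypothetical stand-in for the Python wrapper around a C++ Child."""

    def __init__(self, cpp_address):
        self.cpp_address = cpp_address


class Holder:
    """Hypothetical stand-in for the bound Holder class.

    It "owns" the C++ object (represented here by an address only),
    not the Python wrapper.
    """

    # Registry of wrappers that are currently alive, keyed by C++ address.
    # Entries are held weakly, so they vanish when the wrapper is destroyed.
    _live_wrappers = weakref.WeakValueDictionary()

    def __init__(self, cpp_address):
        self.cpp_address = cpp_address

    def get_child(self):
        # Reuse the existing wrapper if one is alive, else create a new one.
        wrapper = self._live_wrappers.get(self.cpp_address)
        if wrapper is None:
            wrapper = ChildWrapper(self.cpp_address)
            self._live_wrappers[self.cpp_address] = wrapper
        return wrapper


h = Holder(cpp_address=0x6304FB8A8800)

# Result discarded: the wrapper is destroyed right after the call,
# so the next call has to build a fresh wrapper (possibly at a new address).
h.get_child()
no_wrapper_survives = len(Holder._live_wrappers) == 0

# Result kept in a variable: every later call returns that same wrapper.
a = h.get_child()
same_while_referenced = h.get_child() is a
stable_id = id(h.get_child()) == id(a)

# Dropping the last reference destroys the wrapper again.
del a
wrapper_gone_after_del = len(Holder._live_wrappers) == 0

print(no_wrapper_survives, same_while_referenced, stable_id, wrapper_gone_after_del)
```

With the real module the practical consequence is the same: keep a reference to the returned object (for example a = h.get_child()) for as long as you rely on its id() or identity.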