Tags: python, c++11, pybind11, python-bindings

How can you bind exceptions with custom fields and constructors in pybind11 and still have them function as Python exceptions?


This appears to be a known limitation in pybind11. I read through all the docs, whatever bug reports seemed applicable, and everything I could find in the pybind11 gitter. I have a custom exception class in C++ that contains custom constructors and fields. A very basic example of such a class, trimmed for space, is here:

class BadData : public std::exception
{
  public:
    // Constructors
    BadData()
      : msg(),
        stack(),
        _name("BadData")
    {}

    BadData(std::string _msg, std::string _stack)
      : msg(_msg),
        stack(_stack),
        _name("BadData")
    {}

    const std::string&
    getMsg() const
    {
      return msg;
    }

    void
    setMsg(const std::string& arg)
    {
      msg = arg;
    }

    // Member stack
    const std::string&
    getStack() const
    {
      return stack;
    }

    void
    setStack(const std::string& arg)
    {
      stack = arg;
    }
  private:
    std::string msg;
    std::string stack;
    std::string _name;
    // Other members, such as toString(), are trimmed here.
};

I currently have python binding code that binds this into python, but it is custom generated and we'd much rather use pybind11 due to its simplicity and compile speed.

The default mechanism for binding an exception into pybind11 would look like

py::register_exception<BadData>(module, "BadData");

That will create an automatic translation between the C++ exception and the Python exception, with the what() value of the C++ exception becoming the message of the Python exception. However, all the extra data from the C++ exception is lost, and if you throw the exception in Python and catch it in C++, you cannot throw it with any of the extra data.
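
To make the data loss concrete, here is a minimal sketch in plain C++ with no pybind11 involved. `Payload`, `translate_what_only`, and the two helper functions are hypothetical names invented for this illustration. The sketch only assumes what is stated above: a translation that looks at nothing but `what()` cannot carry the other fields across.

```cpp
#include <cassert>
#include <exception>
#include <string>
#include <utility>

// Hypothetical stand-in for an exception with extra fields (not the real BadData).
class Payload : public std::exception {
  public:
    Payload(std::string msg, std::string stack)
      : msg_(std::move(msg)), stack_(std::move(stack)) {}
    const char* what() const noexcept override { return msg_.c_str(); }
    const std::string& stack() const { return stack_; }
  private:
    std::string msg_;
    std::string stack_;
};

// Mimics a what()-only translation: the argument is seen as std::exception,
// so only the message string is copied out and stack() is never consulted.
inline std::string translate_what_only(const std::exception& e) {
    return e.what();
}

// Throws a Payload and returns what a what()-only translation would keep.
inline std::string demo_lossy_translation() {
    try {
        throw Payload("this is an error", "frame 0");
    } catch (const std::exception& e) {
        return translate_what_only(e);
    }
}

// Self-check: the message comes back, the stack text does not.
inline void check_lossy_translation() {
    assert(demo_lossy_translation() == "this is an error");
}
```

Calling `demo_lossy_translation()` gives back the message text only. Keeping the `"frame 0"` string would require catching the concrete type, which is the motivation for the rest of this question.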

You can bind extra data onto the Python object using attr, and I even went some way down the path of extending the py::exception class to make it easier to add custom fields to exceptions.

  template <typename type>
  class exception11 : public ::py::exception<type>
  {
   public:
    exception11(::py::handle scope, const char *name, PyObject *base = PyExc_Exception)
      : ::py::exception<type>(scope, name, base)
    {}

    template <typename Func, typename... Extra>
    exception11 &def(const char *name_, Func&& f, const Extra&... extra) {
      ::py::cpp_function cf(::py::method_adaptor<type>(std::forward<Func>(f)),
                            ::py::name(name_),
                            ::py::is_method(*this),
                            ::py::sibling(getattr(*this, name_, ::py::none())),
                            extra...);
      this->attr(cf.name()) = cf;
      return *this;
    }
  };

This adds a def function to exceptions, similar to what is done with class_. The naive approach to using this doesn't work:

    exception11< ::example::data::BadData>(module, "BadData")
      .def("getStack", &::example::data::BadData::getStack);

This fails because there is no automatic translation between BadData in C++ and in Python. You can try to work around this by binding in a lambda:

    .def("getStack", [](py::object& obj) {
      ::example::data::BadData *cls = obj.cast< ::example::data::BadData* >();
      return cls->getStack();
    });

The obj.cast there also fails because there is no automatic conversion. Basically, with no place to store the C++ instance, there isn't really a workable solution for this approach that I could find. In addition, I couldn't find a way to bind in custom constructors at all, which made usability from Python very weak.

The next attempt was based on a suggestion I had seen for pybind11 that you could use the Python exception type as a metaclass for a normal class_ and have Python recognize it as a valid exception. I tried a plethora of variations on this approach.

py::class_< ::example::data::BadData >(module, "BadData", py::dynamic_attr(), py::reinterpret_borrow<py::object>(PyExc_Exception))
py::class_< ::example::data::BadData >(module, "BadData", py::dynamic_attr(), py::cast(PyExc_Exception))
py::class_< ::example::data::BadData >(module, "BadData", py::dynamic_attr(), py::cast(PyExc_Exception->ob_type))
py::class_< ::example::data::BadData>(module, "BadData", py::metaclass((PyObject *) &PyExc_Exception->ob_type))

There were more that I don't have saved. But the overall result was that either 1) it was ignored completely, 2) it failed to compile, or 3) it compiled and then immediately segfaulted or raised an ImportError when trying to make an instance. There might have been one that segfaulted on module import too. It all blurs together. Maybe there is some magic formula that would make such a thing work, but I was unable to find it. From my reading of the pybind11 internals, I do not believe that such a thing is actually possible. Inheriting from a raw Python type does not seem to be something it is set up to let you do.

The last thing I tried seemed really clever. I made a python exception type

  static py::exception<::example::data::BadData> exc_BadData(module, "BadDataBase");

and then had my pybind11 class_ inherit from that.

  py::class_< ::example::data::BadData >(module, "BadData", py::dynamic_attr(), exc_BadData)

But that also segfaulted on import. So I'm basically back to square one with this.


Solution

  • So I figured out a way to actually do this, but it involves 1) doing some hacking of the pybind11 code itself and 2) introducing some size inefficiencies to the bound Python types. From my point of view, the size issues are fairly immaterial. Yes, it would be better to have everything perfectly sized, but I'll take some extra bytes of memory for ease of use. Given this inefficiency, though, I'm not submitting this as a PR to the pybind11 project. While I think the trade-off is worth it, I doubt that making this the default for most people would be desired. It would be possible, I guess, to hide this functionality behind a #define in C++, but that seems like it would be super messy long-term. There is probably a better long-term answer that would involve a degree of template meta-programming (parameterizing on the Python container type for class_) that I'm just not up to.

    I'm providing my changes here as diffs against the current master branch in git when this was written (hash a54eab92d265337996b8e4b4149d9176c2d428a6).

    The basic approach was

    1. Modify pybind11 to allow the specification of an exception base class for a class_ instance.
    2. Modify pybind11's internal container to have the extra fields needed for a python exception type
    3. Write a small amount of custom binding code to handle setting the error correctly in python.

    For the first part, I added a new attribute to type_record to specify if a class is an exception and added the associated process_attribute call for parsing it.

    diff --git a/src/pybind11/include/pybind11/attr.h b/src/pybind11/include/pybind11/attr.h
    index 58390239..b5535558 100644
    --- a/src/pybind11/include/pybind11/attr.h
    +++ b/src/pybind11/include/pybind11/attr.h
    @@ -73,6 +73,9 @@ struct module_local { const bool value; constexpr module_local(bool v = true) :
     /// Annotation to mark enums as an arithmetic type
     struct arithmetic { };
    
    +// Annotation that marks a class as needing an exception base type.
    +struct is_except {};
    +
     /** \rst
         A call policy which places one or more guard variables (``Ts...``) around the function call.
    
    @@ -211,7 +214,8 @@ struct function_record {
     struct type_record {
         PYBIND11_NOINLINE type_record()
             : multiple_inheritance(false), dynamic_attr(false), buffer_protocol(false),
    -          default_holder(true), module_local(false), is_final(false) { }
    +          default_holder(true), module_local(false), is_final(false),
    +          is_except(false) { }
    
         /// Handle to the parent scope
         handle scope;
    @@ -267,6 +271,9 @@ struct type_record {
         /// Is the class inheritable from python classes?
         bool is_final : 1;
    
    +    // Does the class need an exception base type?
    +    bool is_except : 1;
    +
         PYBIND11_NOINLINE void add_base(const std::type_info &base, void *(*caster)(void *)) {
             auto base_info = detail::get_type_info(base, false);
             if (!base_info) {
    @@ -451,6 +458,11 @@ struct process_attribute<is_final> : process_attribute_default<is_final> {
         static void init(const is_final &, type_record *r) { r->is_final = true; }
     };
    
    +template <>
    +struct process_attribute<is_except> : process_attribute_default<is_except> {
    +    static void init(const is_except &, type_record *r) { r->is_except = true; }
    +};
    
    

    I modified the internals.h file to add a separate base class for exception types. I also added an extra bool argument to make_object_base_type.

    diff --git a/src/pybind11/include/pybind11/detail/internals.h b/src/pybind11/include/pybind11/detail/internals.h
    index 6224dfb2..d84df4f5 100644
    --- a/src/pybind11/include/pybind11/detail/internals.h
    +++ b/src/pybind11/include/pybind11/detail/internals.h
    @@ -16,7 +16,7 @@ NAMESPACE_BEGIN(detail)
     // Forward declarations
     inline PyTypeObject *make_static_property_type();
     inline PyTypeObject *make_default_metaclass();
    -inline PyObject *make_object_base_type(PyTypeObject *metaclass);
    +inline PyObject *make_object_base_type(PyTypeObject *metaclass, bool is_except);
    
     // The old Python Thread Local Storage (TLS) API is deprecated in Python 3.7 in favor of the new
     // Thread Specific Storage (TSS) API.
    @@ -107,6 +107,7 @@ struct internals {
         PyTypeObject *static_property_type;
         PyTypeObject *default_metaclass;
         PyObject *instance_base;
    +    PyObject *exception_base;
     #if defined(WITH_THREAD)
         PYBIND11_TLS_KEY_INIT(tstate);
         PyInterpreterState *istate = nullptr;
    @@ -292,7 +293,8 @@ PYBIND11_NOINLINE inline internals &get_internals() {
             internals_ptr->registered_exception_translators.push_front(&translate_exception);
             internals_ptr->static_property_type = make_static_property_type();
             internals_ptr->default_metaclass = make_default_metaclass();
    -        internals_ptr->instance_base = make_object_base_type(internals_ptr->default_metaclass);
    +        internals_ptr->instance_base = make_object_base_type(internals_ptr->default_metaclass, false);
    +        internals_ptr->exception_base = make_object_base_type(internals_ptr->default_metaclass, true);
    

    And then in class.h I added the necessary code to generate the exception base type. The first caveat is here. Since PyExc_Exception is a garbage-collected type, I had to guard the assert call that checks the GC flag on the type so that it only runs for the non-exception base. I have not seen any bad behavior from this change so far, but this is definitely voiding the warranty right here. I would highly, highly recommend always passing the py::dynamic_attr() flag to any class you use py::is_except on, since that turns on all the necessary bells and whistles to handle GC correctly (I think). A better solution might be to turn all those things on in make_object_base_type without having to invoke py::dynamic_attr.

    diff --git a/src/pybind11/include/pybind11/detail/class.h b/src/pybind11/include/pybind11/detail/class.h
    index a05edeb4..bbb9e772 100644
    --- a/src/pybind11/include/pybind11/detail/class.h
    +++ b/src/pybind11/include/pybind11/detail/class.h
    @@ -368,7 +368,7 @@ extern "C" inline void pybind11_object_dealloc(PyObject *self) {
     /** Create the type which can be used as a common base for all classes.  This is
         needed in order to satisfy Python's requirements for multiple inheritance.
         Return value: New reference. */
    -inline PyObject *make_object_base_type(PyTypeObject *metaclass) {
    +inline PyObject *make_object_base_type(PyTypeObject *metaclass, bool is_except=false) {
         constexpr auto *name = "pybind11_object";
         auto name_obj = reinterpret_steal<object>(PYBIND11_FROM_STRING(name));
    
    @@ -387,7 +387,12 @@ inline PyObject *make_object_base_type(PyTypeObject *metaclass) {
    
         auto type = &heap_type->ht_type;
         type->tp_name = name;
    -    type->tp_base = type_incref(&PyBaseObject_Type);
    +    if (is_except) {
    +      type->tp_base = type_incref(reinterpret_cast<PyTypeObject*>(PyExc_Exception));
    +    }
    +    else {
    +      type->tp_base = type_incref(&PyBaseObject_Type);
    +    }
         type->tp_basicsize = static_cast<ssize_t>(sizeof(instance));
         type->tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HEAPTYPE;
    
    @@ -404,7 +409,9 @@ inline PyObject *make_object_base_type(PyTypeObject *metaclass) {
         setattr((PyObject *) type, "__module__", str("pybind11_builtins"));
         PYBIND11_SET_OLDPY_QUALNAME(type, name_obj);
    
    -    assert(!PyType_HasFeature(type, Py_TPFLAGS_HAVE_GC));
    +    if (!is_except) {
    +      assert(!PyType_HasFeature(type, Py_TPFLAGS_HAVE_GC));
    +    }
         return (PyObject *) heap_type;
     }
    
    @@ -565,7 +572,8 @@ inline PyObject* make_new_python_type(const type_record &rec) {
    
         auto &internals = get_internals();
         auto bases = tuple(rec.bases);
    -    auto base = (bases.size() == 0) ? internals.instance_base
    +    auto base = (bases.size() == 0) ? (rec.is_except ? internals.exception_base
    +                                                     : internals.instance_base)
    

    And then the final change, which is the inefficiency part. In Python, everything is a PyObject, but that is really only two fields (set up with the PyObject_HEAD macro), and the actual object struct may have a lot of extra fields. Having a very precise layout is important because Python sometimes uses offsetof to seek into these structs. From the Python 2.7 source code (Include/pyerrors.h) you can see the struct that is used for base exceptions:

    typedef struct {
        PyObject_HEAD
        PyObject *dict;
        PyObject *args;
        PyObject *message;
    } PyBaseExceptionObject;
    

    Any pybind11 type that extends PyExc_Exception has to have an instance struct that contains the same initial layout. In pybind11 currently, the instance struct just has PyObject_HEAD. That means that if you don't change the instance struct, this will all compile, but when Python seeks into this object, it will do so on the assumption that those extra fields exist, and then it will seek right off the end of viable memory and you'll get all sorts of fun segfaults. So this change adds those extra fields to every class_ in pybind11. It does not seem to break normal classes to have these extra fields, and it definitely seems to make exceptions work correctly. If we broke the warranty before, we just tore it up and lit it on fire.

    diff --git a/src/pybind11/include/pybind11/detail/common.h b/src/pybind11/include/pybind11/detail/common.h
    index dd626793..b32e0c70 100644
    --- a/src/pybind11/include/pybind11/detail/common.h
    +++ b/src/pybind11/include/pybind11/detail/common.h
    @@ -392,6 +392,10 @@ struct nonsimple_values_and_holders {
     /// The 'instance' type which needs to be standard layout (need to be able to use 'offsetof')
     struct instance {
         PyObject_HEAD
    +    // Necessary to support exceptions.
    +    PyObject *dict;
    +    PyObject *args;
    +    PyObject *message;
         /// Storage for pointers and holder; see simple_layout, below, for a description
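
The layout constraint can be illustrated without any Python headers. The sketch below uses hypothetical stand-in structs (`FakeHead`, `FakeBaseException`, `HeadOnlyInstance`, `PatchedInstance`), not the real CPython or pybind11 definitions. It assumes only that the base type's code may touch every byte of its own struct, so a derived layout is safe only when its own data starts after that region.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-ins; none of these are the real CPython or pybind11 types.
struct FakeHead {            // plays the role of PyObject_HEAD
    long refcnt;
    void* type;
};

struct FakeBaseException {   // plays the role of PyBaseExceptionObject
    FakeHead head;
    void* dict;
    void* args;
    void* message;
};

struct HeadOnlyInstance {    // head followed directly by the binding's own data
    FakeHead head;
    void* value_ptr;
};

struct PatchedInstance {     // exception fields first, then the binding's own data
    FakeHead head;
    void* dict;
    void* args;
    void* message;
    void* value_ptr;
};

static_assert(std::is_standard_layout<PatchedInstance>::value,
              "offsetof is only well defined for standard-layout types");

// True when data stored at own_offset lies inside the bytes the base type uses.
inline bool overlaps_base_region(std::size_t own_offset) {
    return own_offset < sizeof(FakeBaseException);
}

// Self-check of the two layouts.
inline void check_layouts() {
    // Head-only layout: its data sits where the base expects dict/args/message.
    assert(overlaps_base_region(offsetof(HeadOnlyInstance, value_ptr)));
    // Patched layout: its data starts after the exception fields.
    assert(!overlaps_base_region(offsetof(PatchedInstance, value_ptr)));
    assert(offsetof(PatchedInstance, args) == offsetof(FakeBaseException, args));
}
```

In the head-only layout, a write to the base's `dict` slot would land on `value_ptr`. That is the kind of corruption the extra fields in the diff above are meant to prevent.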
    

    However, once these changes are all made, here is what you can do. Bind in the class

     auto PyBadData = py::class_< ::example::data::BadData>(module, "BadData", py::is_except(), py::dynamic_attr())
        .def(py::init<>())
        .def(py::init< std::string, std::string >())
        .def("__str__", &::example::data::BadData::toString)
        .def("getStack", &::example::data::BadData::getStack)
        .def_property("message", &::example::data::BadData::getMsg, &::example::data::BadData::setMsg)
        .def("getMsg", &::example::data::BadData::getMsg);
    

    And take a function in c++ that throws the exception

    void raiseMe()
    {
      throw ::example::data::BadData("this is an error", "");
    }
    

    and bind that in too

    module.def("raiseMe", &raiseMe, "A function throws");
    

    Add an exception translator that sets the whole bound object, not just a message string, as the active Python error:

        py::register_exception_translator([](std::exception_ptr p) {
          try {
              if (p) {
                std::rethrow_exception(p);
              }
          } catch (const ::example::data::BadData &e) {
            auto err = py::cast(e);
            auto errType = err.get_type().ptr();
            PyErr_SetObject(errType, err.ptr());
          }
        });
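
The translator above relies on the standard rethrow-and-catch idiom for `std::exception_ptr`. Here is a minimal sketch of that idiom on its own, in plain C++ with no pybind11 or Python. `StackError` and `describe` are hypothetical names used only for illustration. The fallback `catch` is a simplification of this sketch and is not something the translator above contains.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical exception type carrying one extra field.
class StackError : public std::runtime_error {
  public:
    StackError(const std::string& msg, std::string stack)
      : std::runtime_error(msg), stack_(std::move(stack)) {}
    const std::string& stack() const { return stack_; }
  private:
    std::string stack_;
};

// Same shape as the translator lambda: rethrow the stored exception, then
// catch the concrete type so that all of its fields are available.
inline std::string describe(std::exception_ptr p) {
    try {
        if (p) {
            std::rethrow_exception(p);
        }
    } catch (const StackError& e) {
        return std::string(e.what()) + " | stack=" + e.stack();
    } catch (const std::exception& e) {
        return e.what();  // fallback in this sketch: only the message survives
    }
    return "no exception";
}

// Self-check using std::make_exception_ptr to build the stored exception.
inline void check_describe() {
    assert(describe(std::make_exception_ptr(StackError("bad", "f0")))
           == "bad | stack=f0");
}
```

Catching by the concrete type is what makes the extra field reachable. In the translator above, the same step is what lets `py::cast(e)` see a full `BadData` object.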
    

    And then you get all the things you could want!

    >>> import example
    >>> example.raiseMe()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    example.BadData: BadData(msg=this is an error, stack=)
    

    You can, of course, also instantiate and raise the exception from python as well

    >>> import example
    >>> raise example.BadData("this is my error", "no stack")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    example.BadData: BadData(msg=this is my error, stack=no stack)