python garbage-collection cython cyclic-reference

Does `gc` treat Cython's `__dealloc__` similarly to `__del__`?


Before Python 3.4, Python's optional cyclic garbage collector, gc, refused to collect cycles that contain any object with a __del__ method. The documentation notes:

Changed in version 3.4: Following PEP 442, objects with a __del__() method don’t end up in gc.garbage anymore.

Cython extension types can have a __dealloc__ method, but no __del__ method:

Note: There is no __del__() method for extension types.

For the purpose of collecting cycles, is the presence of __dealloc__ treated by the garbage collector as if a __del__ method is present? Or is __dealloc__ invisible to the garbage collector?


Solution

  • If you look at the generated C code, you can see that Cython puts the destructor in the tp_dealloc slot rather than the tp_del slot. For example:

    cdef class B:
        def __dealloc__(self):
            pass
    

    generates:

    static PyTypeObject __pyx_type_5cy_gc_B = {
      PyVarObject_HEAD_INIT(0, 0)
      "cy_gc.B", /*tp_name*/
      sizeof(struct __pyx_obj_5cy_gc_B), /*tp_basicsize*/
      0, /*tp_itemsize*/
      __pyx_tp_dealloc_5cy_gc_B, /*tp_dealloc*/
    
      /* lines omitted */
    
      0, /*tp_del*/
      0, /*tp_version_tag*/
      #if PY_VERSION_HEX >= 0x030400a1
      0, /*tp_finalize*/
      #endif
    };
    

    You can easily verify that this is the case for other examples too (e.g. classes with automatically generated __dealloc__).

    Therefore, for Python 3.4+, the relevant part of the gc.garbage documentation is:

    Starting with Python 3.4, this list should be empty most of the time, except when using instances of C extension types with a non-NULL tp_del slot.

    Instances of Cython classes should not end up in this list of uncollectable objects, since their types don't define tp_del.
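For comparison, the pure-Python side of this behaviour can be checked directly. The following is a minimal sketch, assuming CPython 3.4 or later; the `Node` class and `make_cycle` helper are hypothetical names used only for illustration. It builds a reference cycle between two objects that define `__del__` and confirms that the collector finalizes them instead of parking them in `gc.garbage`.

```python
import gc


class Node:
    """Plain Python class with a finalizer, used to build a reference cycle."""

    finalized = 0  # counts how many instances have had __del__ called

    def __del__(self):
        Node.finalized += 1


def make_cycle():
    # a and b reference each other, so reference counting alone
    # can never free them once the local names go away.
    a, b = Node(), Node()
    a.other, b.other = b, a


make_cycle()
gc.collect()  # force a full collection so the cycle is found now

print("finalizers run:", Node.finalized)
print("Node objects in gc.garbage:",
      [o for o in gc.garbage if isinstance(o, Node)])
```

This says nothing about Cython by itself; it only shows the post-PEP 442 baseline that the quoted documentation describes.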


    For earlier versions of Python I think you're also fine, mostly because you still don't have a __del__ method, but also because Cython automatically generates tp_traverse and tp_clear functions that should allow Python to break reference cycles involving Cython classes.

    You can disable the generation of these tp_traverse and tp_clear functions (Cython provides the @cython.no_gc and @cython.no_gc_clear decorators for this). I'm a little unclear on what happens to objects that are in a reference cycle but have no methods to detect it or to break it. It's quite likely that they just continue to exist somewhere but are inaccessible.


    I think the concern (before Python 3.4) was that __del__ methods could make an object accessible again:

    class C:
        def __del__(self):
            global x
            x = self
    

    __dealloc__ is called after the point of no return, so resurrecting the object this way isn't allowed (you will most likely get a segmentation fault if you access x afterwards). Therefore such objects don't have to be stuck in gc.garbage in an indeterminate state.
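The resurrection behaviour of `__del__` is easy to observe in plain Python. This is a small sketch, assuming CPython (where `del` on the last reference runs the finalizer immediately); the names `resurrected` and `payload` are made up for the example. It shows that an object whose `__del__` stores a new reference to itself remains fully usable, which is exactly what a `__dealloc__` implementation must not attempt.

```python
resurrected = None  # module-level slot the finalizer writes into


class C:
    def __del__(self):
        global resurrected
        # Storing a new reference here brings the object back to life.
        resurrected = self


c = C()
c.payload = "still here"
del c  # last reference dropped: __del__ runs and resurrects the object

# The object survived its own finalizer and its attributes are intact.
print(resurrected.payload)
```

With a Cython `__dealloc__` there is no equivalent safe pattern, because the memory is released once the method returns.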