python class attributes python-datamodel

Why are the class __dict__ and __weakref__ never re-defined in Python?

Class creation seems to never re-define the __dict__ and __weakref__ class attributes (i.e. if they already exist in the dictionary of a superclass, they are not added to the dictionaries of its subclasses), but to always re-define the __doc__ and __module__ class attributes. Why?

>>> class A: pass
... 
>>> class B(A): pass
... 
>>> class C(B): __slots__ = ()
... 
>>> vars(A)
mappingproxy({'__module__': '__main__',
              '__dict__': <attribute '__dict__' of 'A' objects>,
              '__weakref__': <attribute '__weakref__' of 'A' objects>,
              '__doc__': None})
>>> vars(B)
mappingproxy({'__module__': '__main__', '__doc__': None})
>>> vars(C)
mappingproxy({'__module__': '__main__', '__slots__': (), '__doc__': None})

>>> class A: __slots__ = ()
... 
>>> class B(A): pass
... 
>>> class C(B): pass
... 
>>> vars(A)
mappingproxy({'__module__': '__main__', '__slots__': (), '__doc__': None})
>>> vars(B)
mappingproxy({'__module__': '__main__',
              '__dict__': <attribute '__dict__' of 'B' objects>,
              '__weakref__': <attribute '__weakref__' of 'B' objects>,
              '__doc__': None})
>>> vars(C)
mappingproxy({'__module__': '__main__', '__doc__': None})

Solution

The '__dict__' and '__weakref__' entries in a class's __dict__ (when present) are descriptors used for retrieving an instance's dict pointer and weakref pointer from the instance memory layout. They are not the class's own __dict__ and __weakref__ attributes; those are managed by descriptors on the metaclass.
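As a rough illustration of that distinction, the sketch below pulls the descriptor out of a class's namespace and calls it by hand. The names instance_dict_descr and class_dict_descr are made up for this example.

```python
import types

class A:
    pass

class B(A):
    pass

# The '__dict__' entry in A's namespace is a getset descriptor, not a dict.
instance_dict_descr = vars(A)['__dict__']
assert isinstance(instance_dict_descr, types.GetSetDescriptorType)

# B defines no descriptor of its own; B instances are served by A's.
b = B()
b.x = 1
assert '__dict__' not in vars(B)
assert instance_dict_descr.__get__(b, B) == {'x': 1}

# The class's own __dict__ is produced by a different descriptor,
# which lives on the metaclass (type).
class_dict_descr = vars(type)['__dict__']
assert isinstance(class_dict_descr.__get__(A, type), types.MappingProxyType)
```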

There's no point adding those descriptors if a class's ancestors already provide one. However, a class does need its own __module__ and __doc__, regardless of whether its parents already have one - it doesn't make sense for a class to inherit its parent's module name or docstring.
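A small check of that difference, using throwaway classes that are not part of the question:

```python
class A:
    """Docstring of A."""

class B(A):
    pass

# B gets its own '__doc__' and '__module__' entries...
assert '__doc__' in vars(B)
assert '__module__' in vars(B)

# ...so A's docstring does not leak through to B.
assert A.__doc__ == "Docstring of A."
assert B.__doc__ is None

# The instance-layout descriptors, by contrast, are only on A.
assert '__dict__' in vars(A) and '__dict__' not in vars(B)
assert '__weakref__' in vars(A) and '__weakref__' not in vars(B)
```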

You can see the implementation in type_new, the (very long) C implementation of type.__new__. Look for the add_weak and add_dict variables - those are the variables that determine whether type.__new__ should add space for __dict__ and __weakref__ in the class's instance memory layout. If type.__new__ decides it should add space for one of those attributes to the instance memory layout, it also adds getset descriptors to the class (through tp_getset) to retrieve the attributes:

if (add_dict) {
    if (base->tp_itemsize)
        type->tp_dictoffset = -(long)sizeof(PyObject *);
    else
        type->tp_dictoffset = slotoffset;
    slotoffset += sizeof(PyObject *);
}
if (add_weak) {
    assert(!base->tp_itemsize);
    type->tp_weaklistoffset = slotoffset;
    slotoffset += sizeof(PyObject *);
}
type->tp_basicsize = slotoffset;
type->tp_itemsize = base->tp_itemsize;
type->tp_members = PyHeapType_GET_MEMBERS(et);

if (type->tp_weaklistoffset && type->tp_dictoffset)
    type->tp_getset = subtype_getsets_full;
else if (type->tp_weaklistoffset && !type->tp_dictoffset)
    type->tp_getset = subtype_getsets_weakref_only;
else if (!type->tp_weaklistoffset && type->tp_dictoffset)
    type->tp_getset = subtype_getsets_dict_only;
else
    type->tp_getset = NULL;
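The four tp_getset outcomes can be observed from Python by naming '__dict__' or '__weakref__' explicitly in __slots__. This is only a sketch: the class names and the descriptors helper are invented for the example.

```python
import weakref

class Full:
    pass

class DictOnly:
    __slots__ = ('__dict__',)

class WeakOnly:
    __slots__ = ('__weakref__',)

class Neither:
    __slots__ = ()

def descriptors(cls):
    """Return which of the two layout descriptors cls itself defines."""
    return {name for name in ('__dict__', '__weakref__') if name in vars(cls)}

assert descriptors(Full) == {'__dict__', '__weakref__'}
assert descriptors(DictOnly) == {'__dict__'}
assert descriptors(WeakOnly) == {'__weakref__'}
assert descriptors(Neither) == set()

# Without reserved space for a weakref pointer, weak references are refused.
try:
    weakref.ref(Neither())
except TypeError:
    weakref_refused = True
else:
    weakref_refused = False
assert weakref_refused
```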

If add_dict or add_weak is false, the corresponding space is not reserved and the corresponding descriptor is not added. One reason for add_dict or add_weak to be false is that one of the parents already reserved the space:

add_dict = 0;
add_weak = 0;
may_add_dict = base->tp_dictoffset == 0;
may_add_weak = base->tp_weaklistoffset == 0 && base->tp_itemsize == 0;

This check doesn't actually care about any ancestor descriptors, just whether an ancestor reserved space for an instance dict pointer or weakref pointer. So if a C ancestor reserved space without providing a descriptor, the child won't reserve space or provide a descriptor either. For example, set has a nonzero tp_weaklistoffset but no __weakref__ descriptor, so descendants of set won't provide a __weakref__ descriptor, even though instances of set (including subclass instances) support weak references.
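The set case is easy to confirm interactively. A minimal sketch, with S as a throwaway subclass:

```python
import weakref

class S(set):
    pass

# Neither set nor its subclass exposes a __weakref__ descriptor...
assert not hasattr(set, '__weakref__')
assert '__weakref__' not in vars(S)

# ...yet weak references to instances work, because set itself
# reserved the space for the weakref pointer.
s = S()
r = weakref.ref(s)
assert r() is s

# set reserves no instance dict, so S does add the __dict__ descriptor.
assert '__dict__' in vars(S)
```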

You'll also see an && base->tp_itemsize == 0 in the initialization for may_add_weak - you can't add weakref support to a subclass of a class with variable-length instances.
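tuple is one such variable-length type, so a plain subclass of it shows the effect. A sketch, with T as a hypothetical subclass:

```python
import weakref

class T(tuple):
    pass

# tuple has a nonzero tp_itemsize, so no weakref slot or descriptor is added.
assert tuple.__itemsize__ != 0
assert '__weakref__' not in vars(T)

# Instances therefore cannot be weakly referenced.
try:
    weakref.ref(T())
except TypeError:
    weakref_refused = True
else:
    weakref_refused = False
assert weakref_refused

# An instance dict is still allowed for variable-length bases.
t = T()
t.x = 1
assert t.__dict__ == {'x': 1}
```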
