
Why does `list()` call `__len__()`?


The setup code:

class MyContainer:
    def __init__(self):
        self.stuff = [1, 2, 3]

    def __iter__(self):
        print("__iter__")
        return iter(self.stuff)

    def __len__(self):
        print("__len__")
        return len(self.stuff)

mc = MyContainer()

Now, in my shell:

>>> i = iter(mc)
__iter__
>>> [x for x in i]
[1, 2, 3]
>>> list(mc)
__iter__
__len__
[1, 2, 3]

Why is __len__() getting called by list()? And where is that documented?


Solution

  • Calling __len__ on the given iterable while initializing a new list is a CPython implementation detail. It lets list() pre-allocate memory for the estimated size of the result, instead of repeatedly resizing the list as it is extended with items produced by a generic iterable.

    You can find the logic in Objects/listobject.c of CPython. PyObject_LengthHint tries __len__ first, then __length_hint__ (specified in PEP 424), and falls back to the given default of 8 if the iterable has neither:

    static int
    list_extend_iter_lock_held(PyListObject *self, PyObject *iterable)
    {
        PyObject *it = PyObject_GetIter(iterable);
        if (it == NULL) {
            return -1;
        }
        PyObject *(*iternext)(PyObject *) = *Py_TYPE(it)->tp_iternext;
    
        /* Guess a result list size. */
        Py_ssize_t n = PyObject_LengthHint(iterable, 8);
        ...
    }
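
The same lookup order can be observed from Python with operator.length_hint, which PEP 424 added as the Python-level counterpart of PyObject_LengthHint. The following is only a minimal sketch: the classes WithLen, WithHint and Neither are hypothetical names made up for illustration, and the default of 8 is passed explicitly to mirror the C code above.

```python
import operator


class WithLen:
    """Hypothetical iterable that reports an exact size through __len__."""

    def __iter__(self):
        return iter([1, 2, 3])

    def __len__(self):
        return 3


class WithHint:
    """Hypothetical iterable that only offers an estimate (PEP 424)."""

    def __iter__(self):
        return iter([1, 2, 3])

    def __length_hint__(self):
        return 5  # deliberately wrong: a hint is allowed to be inaccurate


class Neither:
    """Hypothetical iterable with no size information at all."""

    def __iter__(self):
        return iter([1, 2, 3])


objects = (WithLen(), WithHint(), Neither())

# operator.length_hint(obj, default) tries len(obj) first, then
# obj.__length_hint__(), and finally falls back to the default.
hints = [operator.length_hint(obj, 8) for obj in objects]
print(hints)  # [3, 5, 8]

# The hint only guides pre-allocation; the contents come from iteration,
# so every container still produces the same list.
results = [list(obj) for obj in objects]
print(results)  # [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
```

Because the size is only used as a guess, list() still returns the right items when the hint is inaccurate or missing, which is why none of this shows up in the documented behavior of list().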