pythoncpythonrecursive-datastructuresrepr

When did CPython's `repr` start handling recursive data structures?


This MWE in the past would cause a stack overflow, as x references y that references x:

class Ref:
    def __init__(self, name):
        self.name = name
        self.value = None

    def __repr__(self):
        if self.value is None:
            return self.name
        return f"{self.name}={self.value!r}"


if __name__ == '__main__':
    x, y = Ref("x"), Ref("y")

    x.value = (1, y)
    y.value = (2, x)

    print(x)
    print(y)

But as I test it with CPython 3.10.4, it works out of the box!

x=(1, y=(2, x=(...)))
y=(2, x=(1, y=(...)))

I can't find when this behavior changed. I see several questions as recent as 2020 wondering how to handle mutually- or self-recursive data structures. I also found about the reprlib builtin library which produces similar output, so I suspect some language dev decided to use it by default.

Note: I also tested it with __str__ and it also works, so it's not specific to repr().


Solution

  • It actually never really did, and still doesn't as of today (version 3.12.0 alpha 0).

    The case you show is the simplest possible one: recursive repr with instances of the same class. In such case, it's pretty easy for the interpreter to detect that the repr is going to cause infinite recursion and therefore stop and produce ... instead: it just needs to check whether the .__repr__() method for the current class is asking for the .__repr__() of an instance of the same class.

    This has been supported since Python 1.5.1 (1998!) as can be seen in Misc/HISTORY:

    ========================================
    ==> Release 1.5.1 (October 31, 1998) <==
    ========================================
    
    [...]
    
    - No longer a core dump when attempting to print (or repr(), or str())
    a list or dictionary that contains an instance of itself; instead, the
    recursive entry is printed as [...] or {...}.  See Py_ReprEnter() and
    Py_ReprLeave() below.  Comparisons of such objects still go beserk,
    since this requires a different kind of fix; fortunately, this is a
    less common scenario in practice.
    

    Any slightly more complex case will still cause trouble even on the latest CPython version:

    class A:
        def __init__(self):
            self.value = None
        def __repr__(self):
            return f"{self.value!r}"
    
    class B:
        def __init__(self):
            self.value = None
        def __repr__(self):
            return f"{self.value!r}"
    
    a, b = A(), B()
    a.value = b
    b.value = a
    
    print(a)
    # RecursionError: maximum recursion depth exceeded while getting the repr of an object