
yield from vs yield in for-loop


My understanding of yield from is that it is similar to yielding every item from an iterable. Yet I observe different behavior in the following example.

I have Class1

class Class1:
    def __init__(self, gen):
        self.gen = gen
        
    def __iter__(self):
        for el in self.gen:
            yield el

and Class2, which differs only in replacing the yield in a for loop with yield from

class Class2:
    def __init__(self, gen):
        self.gen = gen
        
    def __iter__(self):
        yield from self.gen

The code below reads the first element from an instance of a given class and then reads the rest in a for loop:

a = Class1((i for i in range(3)))
print(next(iter(a)))
for el in iter(a):
    print(el)

This produces different outputs for Class1 and Class2. For Class1 the output is

0
1
2

and for Class2 the output is

0


What is the mechanism behind yield from that produces different behavior?


Solution

  • What Happened?

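    When you use next(iter(instance_of_Class2)), the iterator returned by iter() is a temporary: once next() returns, nothing references it, so (in CPython) it is deleted immediately and its .close() is called. Because that iterator is suspended inside yield from, closing it also calls .close() on the inner generator (the one being delegated to, not just the wrapping iterator!). With Class1, closing the wrapping iterator only unwinds its own for loop and leaves the inner generator open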

    >>> g = (i for i in range(3))
    >>> b = Class2(g)
    >>> i = iter(b)     # hold iterator open
    >>> next(i)
    0
    >>> next(i)
    1
    >>> del(i)          # closes g
    >>> next(iter(b))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration
    

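    This behavior is described in PEP 342 in two parts: the .close() method it adds to generators, and the guarantee that close() is called when a generator-iterator is garbage-collected. Forwarding close() to the generator being delegated to is specified by PEP 380, which introduced yield from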
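The two steps (close on garbage collection, then forwarding through yield from) can be observed directly. The following is a minimal sketch, not taken from the original answer: the names inner, outer and events are hypothetical, and the immediate cleanup after del relies on CPython's reference counting (gc.collect() is called only to help other implementations).

```python
import gc

events = []  # records the order in which the generators are finalized

def inner():
    try:
        yield 0
        yield 1
    finally:
        events.append("inner closed")

def outer(sub):
    try:
        yield from sub  # close() on outer is forwarded to sub from here
    finally:
        events.append("outer closed")

sub = inner()
it = outer(sub)
next(it)       # start delegating: outer is now suspended inside yield from
del it         # last reference gone, so close() runs on outer
gc.collect()   # assumption: only needed on implementations without refcounting

print(events)
print(next(sub, "exhausted"))  # sub was closed even though we still hold it
```

The inner generator is finalized first because the GeneratorExit raised in outer is handed to the delegate before outer's own finally block runs.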

    What happens is a little clearer (if perhaps surprising) when multiple generator delegations occur; only the generator currently being delegated to is closed when its wrapping iterator is deleted

    >>> g1 = (a for a in range(10))
    >>> g2 = (a for a in range(10, 20))
    >>> def test3():
    ...     yield from g1
    ...     yield from g2
    ... 
    >>> next(test3())
    0
    >>> next(test3())
    10
    >>> next(test3())
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration
    

    Fixing Class2

    What options are there to make Class2 behave more the way you expect?

    Notably, other strategies lack the visually pleasing sugar of yield from and some of its potential benefits, but they give you a way to interact with the values, which seems like a primary benefit
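One possible strategy, sketched here under assumptions rather than taken from the original answer, is to keep yield from but hide the inner generator behind a plain iterator that has no close() method. PEP 380 only forwards close() to a delegate that defines it, so the inner generator survives when a temporary iterator is discarded. The names NoClose and Class2Fixed are hypothetical.

```python
class NoClose:
    """Plain iterator wrapper that deliberately exposes no close() method."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)


class Class2Fixed:
    def __init__(self, gen):
        self.gen = gen

    def __iter__(self):
        # yield from finds no close() on the wrapper, so closing this
        # generator does not close self.gen
        yield from NoClose(self.gen)


b = Class2Fixed(i for i in range(3))
print(next(iter(b)))   # reads the first element from a temporary iterator
for el in iter(b):     # the inner generator is still open here
    print(el)
```

The trade-off is that send() and throw() are no longer forwarded to the inner generator either, so this only suits plain iteration.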


    Hunting the Mystery

    A better clue is that if you directly try again, next(iter(instance)) raises StopIteration. This indicates that the generator is permanently closed (either through exhaustion or .close()), which is why iterating over it with a for loop yields no more values

    >>> a = Class1((i for i in range(3)))
    >>> next(iter(a))
    0
    >>> next(iter(a))
    1
    >>> b = Class2((i for i in range(3)))
    >>> next(iter(b))
    0
    >>> next(iter(b))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration
    

    However, if we name the iterator, it works as expected

    >>> b = Class2((i for i in range(3)))
    >>> i = iter(b)
    >>> next(i)
    0
    >>> next(i)
    1
    >>> j = iter(b)
    >>> next(j)
    2
    >>> next(i)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration
    

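    To me, this suggests that when no name is bound to the iterator, it is deleted as soon as the expression finishes and .close() is called on it at that point. Calling .close() by hand reproduces the same failure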

    >>> def gen_test(iterable):
    ...     yield from iterable
    ... 
    >>> g = gen_test((i for i in range(3)))
    >>> next(iter(g))
    0
    >>> g.close()
    >>> next(iter(g))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration
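The hypothesis can also be checked by inspecting the inner generator's state after a temporary iterator has been discarded. This is a small sketch using inspect.getgeneratorstate from the standard library; the helper name state_after_one_read is hypothetical, and the result assumes the temporary iterator is collected promptly, as it is in CPython.

```python
import gc
import inspect


class Class1:
    def __init__(self, gen):
        self.gen = gen

    def __iter__(self):
        for el in self.gen:
            yield el


class Class2:
    def __init__(self, gen):
        self.gen = gen

    def __iter__(self):
        yield from self.gen


def state_after_one_read(cls):
    g = (i for i in range(3))
    obj = cls(g)
    next(iter(obj))  # the temporary iterator is dropped after this line
    gc.collect()     # only matters on implementations without refcounting
    return inspect.getgeneratorstate(g)


print(state_after_one_read(Class1))  # inner generator left suspended
print(state_after_one_read(Class2))  # inner generator closed
```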
    

    Disassembling the result, we find the internals are a little different

    >>> import dis
    >>> a = Class1((i for i in range(3)))
    >>> dis.dis(a.__iter__)
      6           0 LOAD_FAST                0 (self)
                  2 LOAD_ATTR                0 (gen)
                  4 GET_ITER
            >>    6 FOR_ITER                10 (to 18)
                  8 STORE_FAST               1 (el)
    
      7          10 LOAD_FAST                1 (el)
                 12 YIELD_VALUE
                 14 POP_TOP
                 16 JUMP_ABSOLUTE            6
            >>   18 LOAD_CONST               0 (None)
                 20 RETURN_VALUE
    >>> b = Class2((i for i in range(3)))
    >>> dis.dis(b.__iter__)
      6           0 LOAD_FAST                0 (self)
                  2 LOAD_ATTR                0 (gen)
                  4 GET_YIELD_FROM_ITER
                  6 LOAD_CONST               0 (None)
                  8 YIELD_FROM
                 10 POP_TOP
                 12 LOAD_CONST               0 (None)
                 14 RETURN_VALUE
    

    Notably, the yield from version has GET_YIELD_FROM_ITER

    If TOS is a generator iterator or coroutine object it is left as is. Otherwise, implements TOS = iter(TOS).

    (subtly, the YIELD_FROM opcode appears to have been removed in 3.11)

    So if the iterable given to the class is a generator iterator, it is handed to yield from directly rather than wrapped in a new iterator, and closing the wrapping iterator closes that very generator, giving the result we saw
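To make the difference from a plain for loop concrete, the close-forwarding part of yield from can be approximated by hand. This is a simplified sketch of only that one aspect of the PEP 380 semantics (send(), throw() and return values are ignored); the function name delegate is hypothetical.

```python
def delegate(sub):
    """Rough stand-in for `yield from sub`, covering close() forwarding only."""
    it = iter(sub)  # for a generator, iter(sub) is sub itself
    try:
        for el in it:
            yield el
    except GeneratorExit:
        # A plain for loop stops here; yield from additionally closes the delegate
        close = getattr(it, "close", None)
        if close is not None:
            close()
        raise


g = (i for i in range(3))
d = delegate(g)
print(next(d))   # first value from g
d.close()        # forwarded to g, as yield from would do
print(list(g))   # g has been closed, so nothing is left
```

Dropping the except block turns this back into the Class1 behavior, where the inner generator stays open after the wrapper is closed.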


    Extras

    Passing an iterable which isn't a generator, such as a list (a new iterator over it is created each time in both cases)

    >>> a = Class1([i for i in range(3)])
    >>> next(iter(a))
    0
    >>> next(iter(a))
    0
    >>> b = Class2([i for i in range(3)])
    >>> next(iter(b))
    0
    >>> next(iter(b))
    0
    

    Expressly closing Class1's internal generator

    >>> g = (i for i in range(3))
    >>> a = Class1(g)
    >>> next(iter(a))
    0
    >>> next(iter(a))
    1
    >>> a.gen.close()
    >>> next(iter(a))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration
    

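    The inner generator is only closed when a wrapping iterator is deleted if next() has been called on that iterator at least once (an iterator that was never started has not reached its yield from yet)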

    >>> g = (i for i in range(10))
    >>> b = Class2(g)
    >>> i = iter(b)
    >>> next(i)
    0
    >>> j = iter(b)
    >>> del(j)        # next() not called on j
    >>> next(i)
    1
    >>> j = iter(b)
    >>> next(j)
    2
    >>> del(j)        # generator closed
    >>> next(i)       # now fails, despite range(10) above
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration