pythonrecursiongenerator

An explanation for the behavior of a basic tree-recursive generator


I am working on Q3 in discussion 6 of UCB CS61A. It asks us to implement a tree recursive generator. The answer:

def partition_gen(n, m):
    """Yield the partitions of n using parts up to size m.

    >>> for partition in sorted(partition_gen(6, 4)):
    ...     print(partition)
    1 + 1 + 1 + 1 + 1 + 1
    1 + 1 + 1 + 1 + 2
    1 + 1 + 1 + 3
    1 + 1 + 2 + 2
    1 + 1 + 4
    1 + 2 + 3
    2 + 2 + 2
    2 + 4
    3 + 3
    """
    if n == m:
        yield str(n)
    if n - m > 0:
        for p in partition_gen(n - m, m):
            yield p + ' + ' + str(m)
    if m > 1:
        yield from partition_gen(n, m-1)

A=partition_gen(6,4) and calling next(A) three times gets "2+4", "1+1+4" and "3+3".

The first next(A) returns "2+4". A sequence of recursive calls partiton_gen(2,4), partition_gen(2,3), partition_gen(2,2) is made and finally a base case n=m is reached.

For the next iteration the program should start where it stopped the last time and make a sequence of recursive calls again: partition_gen(2,4), partition_gen(2,3), partition_gen(2,2). Here partition_gen(2,2) is a new call and should generate a new generator. Thus, the program should yield a "2" again. But the program behaves as if this is just another call on the same generator and returns "1+1".

In the third next(A), when partition_gen(2,2) is called again, why does no StopIteration error arise?

For designing a recursive function, I assume that the function returns what I want for a smaller size problem and make sure my program returns the correct thing for a larger problem. But for a recursive generator, many extra efforts are needed to carefully keep track of each recursive call and monitor what would be yielded for each call. I guess this extra effort can be avoided.

How should one think about a tree recursive generator?


Solution

  • yield from ... is essentially just shorthand for

    for _ in ...
        yield _
    

    (I'm a little hesitant to say they are equivalent; there are probably some distinctions in how exceptions are handled.)

    The yield from expression has multiple yield points, so when the subiterator yields one value, it's still active when control resumes later, and it continues to yield values until the subiterator is exhausted, at which point the yield from ... expression finally completes.

    Without yield from (which was added to Python much later than the original yield statement), you would have written

    def partition_gen(n, m):
        if n == m:
            yield str(n)
        if n - m > 0:
            for p in partition_gen(n - m, m):
                yield p + ' + ' + str(m)
        if m > 1:
            for p in partition_gen(n, m-1):
                yield p
    

    The original uses one for loop because yield from doesn't provide a way to modify the value yielded by the subiterator before yielding it yourself, though you could write something clunky like

    yield from (p + ' + ' + str(m) for p in partition_gen(n - m, m))