python generator language-concepts

Where are the 'elements' being stored in a generator?


The code below sums all the numbers in the list held in all_numbers. This makes sense, as all the numbers to be summed are held in the list.

def firstn(n):
    '''Return a list of the integers from 0 up to (but not including) n.'''
    num, nums = 0, []
    while num < n:
        nums.append(num)
        num += 1
    return nums

# all numbers are held in a list which is memory intensive
all_numbers = firstn(100000000)
sum_of_first_n = sum(all_numbers)

# Uses 3.8 GB during processing and 1.9 GB to store variables
# 13.9 seconds to process
sum_of_first_n 

When I convert the above function to a generator function, I get the same result with less memory used (code below). What I don't understand is how all_numbers can be summed if it doesn't contain all the numbers in a list like above.

If the numbers are generated on demand, then one would have to generate all of them to sum them together, so where are these numbers being stored, and how does this translate to reduced memory usage?

def firstn(n):
    num = 0
    while num < n:
        yield num
        num += 1

# all numbers are held in a generator
all_numbers = firstn(100000000)
sum_of_first_n = sum(all_numbers)

# Uses < 100 MB during processing and to store variables
# 9.4 seconds to process
sum_of_first_n

I understand how to create a generator function and why you would want to use one, but I don't understand how they work.


Solution

  • A generator does not store the values. Think of a generator as a function with context: it saves its state and GENERATES a value each time it is asked for one. It gives you a value, "discards" it, holds on to the context of the computation, and waits until you ask for more. It keeps doing this until the generator is exhausted.

    def firstn(n):
        num = 0
        while num < n:
            yield num
            num += 1
    

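    In the example you provide, the "only" memory used is num (plus the argument n), which is where the state of the computation is stored. The firstn generator holds num in its context until the while loop is finished. sum() asks for one value at a time and adds it to a running total, so at no point do all the numbers exist in memory together.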
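
To make that concrete, here is a minimal sketch of what happens when sum() consumes the generator. The manual_sum helper is hypothetical (it is not how sum() is actually implemented in CPython, only an approximation of its behaviour in plain Python), and inspect.getgeneratorlocals is used only to peek at the paused generator's saved variables.

```python
import inspect
import sys


def firstn(n):
    """Yield the integers 0, 1, ..., n - 1 one at a time."""
    num = 0
    while num < n:
        yield num
        num += 1


def manual_sum(iterable):
    """Hypothetical stand-in for sum(): pull one value, add it, let it go."""
    total = 0
    iterator = iter(iterable)
    while True:
        try:
            # Resumes firstn where it paused and runs it to the next yield.
            value = next(iterator)
        except StopIteration:
            # The while loop in firstn ended, so there is nothing left.
            break
        total += value  # only the running total is kept, not the value
    return total


gen = firstn(5)
first = next(gen)    # 0
second = next(gen)   # 1

# The generator is paused at the yield; its whole "memory" is its locals.
saved_state = inspect.getgeneratorlocals(gen)  # {'n': 5, 'num': 1}

# Summing the rest continues from the saved state: 2 + 3 + 4.
rest = manual_sum(gen)  # 9

# The generator object stays small no matter how large n is,
# whereas a list grows with the number of elements it holds.
gen_size = sys.getsizeof(firstn(100_000))
list_size = sys.getsizeof(list(firstn(100_000)))
```

Each call to next() produces exactly one number, which becomes garbage as soon as it has been added to the total. That is why the generator version needs far less memory than the list version, even though both end up visiting every number.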