Consider the code:
def test(data):
    for row in data:
        print("first loop")
    for row in data:
        print("second loop")
When data is an iterator, for example a list iterator or a generator expression*, this does not work:
>>> test(iter([1, 2]))
first loop
first loop
>>> test((_ for _ in [1, 2]))
first loop
first loop
This prints "first loop" a few times, since data is non-empty. However, it does not print "second loop". Why does iterating over data work the first time, but not the second time? How can I make it work a second time?
Aside from for loops, the same problem appears to occur with any kind of iteration: list/set/dict comprehensions, passing the iterator to list(), sum() or reduce(), etc.
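A short illustration of the same effect with built-in consumers, as a sketch (the variable names are chosen only for this example):

```python
numbers = iter([1, 2, 3])

first_pass = list(numbers)   # consumes every element
second_pass = list(numbers)  # nothing is left to yield
total = sum(numbers)         # sum over no elements

print(first_pass)   # [1, 2, 3]
print(second_pass)  # []
print(total)        # 0
```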
On the other hand, if data is another kind of iterable, such as a list or a range (which are both sequences), both loops run as expected:
>>> test([1, 2])
first loop
first loop
second loop
second loop
>>> test(range(2))
first loop
first loop
second loop
second loop
* More examples: filter, map, and zip objects (in 3.x); enumerate objects; csv.readers; iterators from the itertools standard library.
For general theory and terminology explanation, see What are iterator, iterable, and iteration?.
To detect whether the input is an iterator or a "reusable" iterable, see Ensure that an argument can be iterated twice.
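As a rough sketch of one such check: the iterator protocol requires an iterator's __iter__ to return the iterator itself, so comparing iter(obj) with obj distinguishes one-shot iterators from reusable iterables. The helper name is_iterator is hypothetical, and the function raises TypeError if obj is not iterable at all.

```python
def is_iterator(obj):
    """Return True if obj is its own iterator, i.e. a one-shot iterable."""
    return iter(obj) is obj

print(is_iterator([1, 2]))             # False
print(is_iterator(range(2)))           # False
print(is_iterator(iter([1, 2])))       # True
print(is_iterator(x for x in [1, 2]))  # True
```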
An iterator can only be consumed once. For example:
data = [1, 2, 3]
it = iter(data)
next(it)
# => 1
next(it)
# => 2
next(it)
# => 3
next(it)
# => raises StopIteration
When the iterator is supplied to a for loop instead, that final StopIteration is what makes the loop exit the first time. Using the same iterator in another for loop raises StopIteration again immediately, because the iterator has already been consumed.
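To make this concrete, here is a simplified sketch of what a for loop does with an iterator. The function drain is a hypothetical stand-in for the loop, not the interpreter's actual implementation:

```python
def drain(iterator):
    """Roughly what a for loop does: call next() until StopIteration."""
    seen = []
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break  # a for loop ends silently at this point
        seen.append(item)
    return seen

it = iter([1, 2, 3])
first = drain(it)
second = drain(it)  # the same exhausted iterator stops immediately

print(first)   # [1, 2, 3]
print(second)  # []
```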
A simple way to work around this is to save all the elements to a list, which can be traversed as many times as needed. For example:
data = list(it)
If the iterator yields many elements and the copies will be consumed at roughly the same pace, however, it's a better idea to create independent iterators using itertools.tee():
import itertools
data = [1, 2, 3]
it1, it2 = itertools.tee(data, 2)  # create as many as needed
Now each one can be iterated over separately:
next(it1)
# => 1
next(it1)
# => 2
next(it2)
# => 1
next(it2)
# => 2
next(it1)
# => 3
next(it2)
# => 3
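The same approach works when the source is a one-shot generator rather than a list. In this sketch the generator and variable names are illustrative; note that once tee() has been called, the original iterator should not be advanced directly, or the tee objects will miss those values.

```python
import itertools

source = (n * n for n in range(3))  # a one-shot generator
first, second = itertools.tee(source, 2)
# From here on, use only first and second, not source.

squares_a = list(first)
squares_b = list(second)

print(squares_a)  # [0, 1, 4]
print(squares_b)  # [0, 1, 4]
```

Because first is fully consumed here before second starts, tee() has to buffer every element, so in this particular pattern a plain list() would serve equally well. tee() pays off when the copies advance together.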