I am trying to understand Python's iterators in the context of the pysam module. Calling the fetch method on a so-called AlignmentFile object returns a proper iterator, iter, over the records in the file, file. I can then use various attributes to access each record, for instance the name with query_name:
import pysam
iter = pysam.AlignmentFile(file, "rb", check_sq=False).fetch(until_eof=True)
for record in iter:
    print(record.query_name)
The records happen to come in pairs, so I would like to do something like:
while True:
    r1 = iter.__next__()
    r2 = iter.__next__()
    print(r1.query_name)
    print(r2.query_name)
Calling next() is probably not the right way for millions of records, but how can one use a for loop to consume the same iterator in pairs of items? I looked at the grouper recipe from itertools and at the SO questions "Iterate an iterator by chunks (of n) in Python? [duplicate]" (even a duplicate!) and "What is the most “pythonic” way to iterate over a list in chunks?", but cannot get it to work.
First of all, don't use the variable name iter, because that's already the name of a builtin function.
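To illustrate the naming point, here is a minimal sketch of what goes wrong once the name is rebound. The function name shadowing_demo is made up for this example, and the shadowing is kept inside a function so it does not leak into the rest of a session:

```python
def shadowing_demo():
    # Rebinding the name hides the builtin iter() within this scope.
    iter = (x for x in (1, 2, 3))
    try:
        # This now tries to call the generator object, not the builtin.
        iter([4, 5, 6])
    except TypeError as exc:
        return type(exc).__name__
    return "no error"

print(shadowing_demo())  # TypeError
```

A neutral name such as iterator or records avoids the problem.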
To answer your question, simply use itertools.izip (Python 2) or zip (Python 3) on the iterator.
Your code may look as simple as
for next_1, next_2 in zip(iterator, iterator):
    ...  # stuff
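This works because zip takes one item from each of its arguments in turn; when both arguments are the same iterator object, every tuple consumes two consecutive items. Below is a minimal Python 3 sketch of that idea. The pairs helper and the read names are made up for illustration, with a plain list standing in for the pysam records:

```python
def pairs(iterable):
    """Yield consecutive, non-overlapping pairs from any iterable."""
    iterator = iter(iterable)       # one shared iterator object
    return zip(iterator, iterator)  # each tuple pulls two items from it

# Hypothetical read names standing in for record.query_name values.
names = ["read1/1", "read1/2", "read2/1", "read2/2"]
for r1, r2 in pairs(names):
    print(r1, r2)
```

Note that with an odd number of items the last one is silently dropped, since zip stops as soon as one of its arguments is exhausted.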
edit: whoops, my original answer was the correct one all along, don't mind the itertools recipe.
edit 2: Consider itertools.izip_longest (itertools.zip_longest in Python 3) if you deal with iterators that could yield an odd number of objects:
>>> from itertools import izip_longest
>>> iterator = (x for x in (1,2,3))
>>>
>>> for next_1, next_2 in izip_longest(iterator, iterator):
...     next_1, next_2
...
(1, 2)
(3, None)
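For Python 3, the same session could be sketched as a script using itertools.zip_longest. The optional fillvalue argument is shown as well; the "missing" marker is just an illustrative choice:

```python
from itertools import zip_longest

# Default padding: the missing partner of the last item becomes None.
iterator = iter((1, 2, 3))
padded = list(zip_longest(iterator, iterator))
print(padded)  # [(1, 2), (3, None)]

# A custom fillvalue makes an unpaired item easier to spot.
iterator = iter((1, 2, 3))
marked = list(zip_longest(iterator, iterator, fillvalue="missing"))
print(marked)  # [(1, 2), (3, 'missing')]
```

Checking the second element against the fill value inside the loop is one way to detect an unpaired trailing record.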