My input is in the form of pairs of comma-separated values, e.g.,
805i,9430 3261i,9418 3950i,9415 4581i,4584i 4729i,9421 6785i,9433 8632i,9434 9391i,9393i
and I want to read them into a list of pairs of strings. The code below does the job for a given line in open(<filename>, 'r'):
bs = line.strip().split()
bss = []
for b in bs:
    x, y = b.split(',')
    bss.append((x, y))
However, is there a way I can do this in one line with a list comprehension? Note that I could do [(b.split(',')[0], b.split(',')[1]) for b in bs], but this unnecessarily calls the split function twice.
You can use an assignment expression (the := operator, available since Python 3.8) to hold some partial state.
bss = [(v[0], v[1]) for b in bs if (v := b.split(','))]
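To make the role of the `if` clause clearer, here is a small runnable sketch of the same comprehension. The sample `line` is a shortened version of the input from the question; everything else is unchanged.

```python
# `line` stands in for one line read from the file (shortened sample from the question).
line = "805i,9430 3261i,9418 4581i,4584i"
bs = line.strip().split()

# The `if` clause exists only to host the assignment. It never filters anything,
# because str.split(',') always returns a list with at least one element,
# and a non-empty list is truthy.
bss = [(v[0], v[1]) for b in bs if (v := b.split(','))]
print(bss)  # [('805i', '9430'), ('3261i', '9418'), ('4581i', '4584i')]
```

Note that `v[1]` would still raise an IndexError for an entry without a comma, so this form assumes well-formed input.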
Another alternative is to use a nested generator expression to create the value.
bss = [(v[0], v[1]) for v in (b.split(',') for b in bs)]
If you know there are always two values, then you can simply write:
bss = [(x, y) for x, y in (b.split(',') for b in bs)]
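The "always two values" condition matters because tuple unpacking is strict. The sketch below shows both cases; `bs_bad` is a made-up malformed input used only for illustration.

```python
bs_good = ["805i,9430", "4581i,4584i"]
bs_bad = ["805i,9430", "4581i"]  # hypothetical input: second entry has no comma

# Well-formed input: every split yields exactly two fields, so unpacking works.
bss = [(x, y) for x, y in (b.split(',') for b in bs_good)]
print(bss)  # [('805i', '9430'), ('4581i', '4584i')]

# Malformed input: unpacking one field into (x, y) raises ValueError.
failed = False
try:
    [(x, y) for x, y in (b.split(',') for b in bs_bad)]
except ValueError as exc:
    failed = True
    print("malformed entry:", exc)
```

If malformed entries are possible, a length check before building the tuple avoids the exception.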
For the final application, I would add a check for empty lines and malformed entries. At that point, it's best not to jam everything onto one line.
The following example assigns a generator expression to a variable, which is an efficient way to break up a large comprehension: the intermediate results are produced lazily, so you avoid the memory overhead of storing them in a temporary list.
with open("data.text") as f:
    pairs = (word.split(',') for line in f for word in line.split())
    bss = [tuple(pair) for pair in pairs if len(pair) == 2]
print(bss)
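To try the same pipeline without a file on disk, the sketch below swaps the file for an `io.StringIO` object. The contents are invented for illustration and include a blank line and two malformed entries to show what the length check drops.

```python
import io

# io.StringIO stands in for open("data.text"); the contents are made up.
f = io.StringIO("805i,9430 3261i,9418\n\n4581i,4584i bad 1,2,3\n")

# Lazy generator: nothing is split until the list comprehension consumes it.
pairs = (word.split(',') for line in f for word in line.split())

# Keep only entries with exactly two fields. The blank line contributes no
# words, and 'bad' (one field) and '1,2,3' (three fields) are skipped.
bss = [tuple(pair) for pair in pairs if len(pair) == 2]
print(bss)  # [('805i', '9430'), ('3261i', '9418'), ('4581i', '4584i')]
```

Because `pairs` reads from `f` lazily, the list must be built while the file is still open, which is why the real version builds `bss` inside the `with` block.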