
How can I find a pattern of numbers in consecutive lines with Python?


I'm learning Python, but I still have some problems with my scripts.

I have a file similar to:

1 5
2 5
3 5
4 2
5 1
6 7
7 7
8 8

I want to find the pair of values 2, 1 on consecutive lines, looking only at column 2, and then print columns 1 and 2 of the matching lines. The result should look like this:

4 2 
5 1 

I'm trying to do this with Python because my file has 4,000,000 data points. This is my script:

import linecache

final_lines = []
with open("file.dat") as f:
    for i, line in enumerate(f, 1):
        if "1" in line:
            if "2" in linecache.getline("file.dat", i-1):
                linestart = i - 1
                final_lines.append(linecache.getline("file.dat", linestart))
print(final_lines)

and the result is:

['2\n', '2\n', '2\n']

What must I change in my script to get the result I want? Can you guide me, please? Thanks a lot.


Solution

  • This should work, I think:

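    import re
    with open("info.dat") as f:
        # Anchor each line so "10" or "12" in column 2 is not mistaken for "1" or "2"
        pattern = r"^\d+ 2[ \t]*\n\d+ 1[ \t]*$"
        for match in re.findall(pattern, f.read(), re.MULTILINE):
            print(match)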
    

    See also: https://repl.it/repls/TatteredViciousResources

    Another alternative is:

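    with open("info.dat") as f:
        lines = f.readlines()
    # Pair each line with its successor and compare only the last column
    for line, nextline in zip(lines, lines[1:]):
        if line.split()[-1:] == ["2"] and nextline.split()[-1:] == ["1"]:
            print(line + nextline, end="")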
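
Since the file has around 4,000,000 lines, it may also be worth avoiding `f.read()` and `f.readlines()`, which both load everything into memory. Here is a minimal sketch that remembers only the previous line while iterating. The helper name `find_pairs` and its `first`/`second` parameters are made up for illustration, and it assumes whitespace-separated columns with the value of interest in the last column. It compares the whole column value, whereas a test like `"1" in line` also matches lines such as `1 5`.

```python
def find_pairs(lines, first="2", second="1"):
    """Yield (line, next_line) for consecutive lines whose last columns are first, second."""
    prev_line = None
    prev_value = None
    for line in lines:
        fields = line.split()
        # Blank lines have no fields; treat their value as None so they never match
        value = fields[-1] if fields else None
        if prev_value == first and value == second:
            yield prev_line, line
        prev_line, prev_value = line, value


# Sample data copied from the question; a real run would pass an open file object instead
sample = ["1 5\n", "2 5\n", "3 5\n", "4 2\n", "5 1\n", "6 7\n", "7 7\n", "8 8\n"]
for a, b in find_pairs(sample):
    print(a, end="")
    print(b, end="")
```

With a real file, the same function can be fed the file object directly, e.g. `with open("file.dat") as f:` followed by `for a, b in find_pairs(f):`, so lines are read one at a time.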