python regex parsing

Python regex: match across multiple lines, but still get the line number


I have lots of log files and want to search them for patterns that span multiple lines. To locate a matched string easily, I also want to see the line number where each match occurs.

Any good suggestions? (Sample code below.)

string="""
####1
ttteest
####1
ttttteeeestt

####2

ttest
####2
"""

import re
pattern = r'.*?####(.*?)####'
matches = re.compile(pattern, re.MULTILINE|re.DOTALL).findall(string)
for item in matches:
    # findall returns only the captured text, so the line number is unknown here
    print("lineno: ?, matched: %s" % item)

[UPDATE] lineno should be the actual line number on which the match starts.

So the output I want looks like:

    lineno: 1, 1
    ttteest
    lineno: 6, 2
    ttest

Solution

  • You can record the end offset of every line beforehand, then look up which line each match falls on.

    import re
    
    string="""
    ####1
    ttteest
    ####1
    ttttteeeestt
    
    ####2
    
    ttest
    ####2
    """
    
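    # Record the end offset of every line (each line must end with '\n').
    end = '.*\n'
    line = []
    for m in re.finditer(end, string):
        line.append(m.end())
    
    pattern = '.*?####(.*?)####'
    match = re.compile(pattern, re.MULTILINE|re.DOTALL)
    for m in re.finditer(match, string):
        # The match is on the first line whose end offset lies past the group start.
        # The index is 0-based, so the leading blank line counts as line 0.
        lineno = next(i for i in range(len(line)) if line[i] > m.start(1))
        print('lineno :%d, %s' % (lineno, m.group(1)))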
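A variation on the same idea, offered as a sketch rather than a drop-in replacement: store the offset where each line starts and use a binary search (`bisect`) instead of scanning the list for every match, which may matter for large log files. The helper name `find_with_line_numbers` and the variable `sample` are made up for this example. It counts lines 1-based from the very start of the text, so with the sample's leading blank line the first marker is reported on line 2 rather than 1.

```python
import re
from bisect import bisect_right

def find_with_line_numbers(text, pattern, flags=0):
    """Yield (lineno, captured_text) per match; lineno is 1-based."""
    # Offsets where each line starts: 0, plus the position after every '\n'.
    line_starts = [0] + [m.end() for m in re.finditer(r'\n', text)]
    for m in re.finditer(pattern, text, flags):
        # The number of line starts at or before the group's offset
        # is the 1-based number of the line the group begins on.
        lineno = bisect_right(line_starts, m.start(1))
        yield lineno, m.group(1)

# Same text as the question's sample string, written with explicit newlines.
sample = "\n####1\nttteest\n####1\nttttteeeestt\n\n####2\n\nttest\n####2\n"

for lineno, matched in find_with_line_numbers(sample, r'.*?####(.*?)####', re.DOTALL):
    print('lineno: %d, matched: %r' % (lineno, matched))
```

Unlike the `'.*\n'` approach, this does not depend on the last line ending with a newline, because a line's start offset is known even when the line is unterminated.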