
Regex: Match brackets both greedy and non-greedy


I'm using Python's regular expression module, re.

I need to match anything inside '(' ')' in these two phrases, but "not so greedy". Like this:

show the (name) of the (person)

calc the sqrt of (+ (* (2 4) 3))

From phrase 1, the result should be:

name
person

From phrase 2, the result should be:

+ (* (2 4) 3)

The problem is that, to fit the first phrase, I used '\(.*?\)'

On the second phrase, this stops at the first closing parenthesis and matches only (+ (* (2 4)

Using '\(.*\)' fits the second phrase correctly, but on the first phrase it matches (name) of the (person)

What regex works correctly on both phrases?
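
For reference, the two attempts described above can be reproduced with a short sketch using re.findall. The variable names here are illustrative only and do not come from the original post.

```python
import re

phrase1 = "show the (name) of the (person)"
phrase2 = "calc the sqrt of (+ (* (2 4) 3))"

# Non-greedy: each match ends at the first ')' after an opening '('.
lazy = re.compile(r"\(.*?\)")
# Greedy: each match runs to the last ')' on the line.
greedy = re.compile(r"\(.*\)")

lazy_1 = lazy.findall(phrase1)      # splits phrase 1 as wanted
lazy_2 = lazy.findall(phrase2)      # cuts the nested expression short
greedy_1 = greedy.findall(phrase1)  # swallows the text between the groups
greedy_2 = greedy.findall(phrase2)  # captures the whole nested expression

for name, result in [("lazy_1", lazy_1), ("lazy_2", lazy_2),
                     ("greedy_1", greedy_1), ("greedy_2", greedy_2)]:
    print(name, result)
```

Neither pattern tracks nesting depth, which is why each one handles only one of the two phrases.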


Solution

  • Pyparsing makes it easy to write simple one-off parsers for stuff like this:

    >>> text = """show the (name) of the (person)
    ...
    ... calc the sqrt of (+ (* (2 4) 3))"""
    >>> import pyparsing
    >>> for match in pyparsing.nestedExpr('(',')').searchString(text):
    ...   print(match[0])
    ...
    ['name']
    ['person']
    ['+', ['*', ['2', '4'], '3']]
    

    Note that the nesting parens have been discarded, and the nested text returned as a nested structure.

    If you want the original text for each parenthetical bit, then use the originalTextFor modifier:

    >>> for match in pyparsing.originalTextFor(pyparsing.nestedExpr('(',')')).searchString(text):
    ...   print(match[0])
    ...
    (name)
    (person)
    (+ (* (2 4) 3))
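
If adding pyparsing as a dependency is not an option, a plain depth counter gives roughly the same result as the second example, since the standard re module has no recursive construct for balancing arbitrary nesting. The following is a minimal sketch, not part of the original answer: top_level_groups is a hypothetical helper name, and it assumes that unbalanced parentheses should simply be ignored.

```python
def top_level_groups(text):
    """Return the contents of each outermost (...) group in text.

    Assumption: a stray ')' or an unclosed '(' is ignored, not an error.
    """
    groups = []
    depth = 0      # current nesting level
    start = None   # index just after the outermost '('
    for i, ch in enumerate(text):
        if ch == "(":
            if depth == 0:
                start = i + 1
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                # Closed an outermost group: keep everything inside it.
                groups.append(text[start:i])
    return groups


print(top_level_groups("show the (name) of the (person)"))
print(top_level_groups("calc the sqrt of (+ (* (2 4) 3))"))
```

This returns the inner text without the enclosing parentheses, matching the output requested in the question; unlike nestedExpr, it does not build a nested structure.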