parsing peg arpeggio

Arpeggio can't go back after a catch


Here's a simple example that shows the problem:

from arpeggio import ParserPython, EOF
from arpeggio import RegExMatch as _

def line(): return _(r".+")
def start(): return [line, (line, line)], EOF

parser = ParserPython(start, debug=True)

input_expr = """
A
B
"""

parse_tree = parser.parse(input_expr)

In the rule start, I expect the parser to first try to match a single line and, if the rest of the parse then fails, to go back and try to match two lines instead. But it looks like Arpeggio doesn't have that capability, and I get arpeggio.NoMatch: Expected EOF at position (3, 1) => ' A *B '.


Solution

  • Arpeggio is based on the PEG formalism, and it never backtracks into an ordered choice once one of its alternatives has matched.

    A quote from the Wikipedia PEG article:

    The fundamental difference between context-free grammars and parsing expression grammars is that the PEG's choice operator is ordered. If the first alternative succeeds, the second alternative is ignored.

    So you have to be careful about the order of the alternatives in an ordered choice. The rule of thumb is to put the more specific matches first. In your case, the sequence (line, line) is more specific and should be tried first.