dart petitparser

Why does *any* not backtrack in this example?


I'm trying to understand why in the following example I do not get a match on f2. Contrast that with f1 which does succeed as expected.

import 'package:petitparser/petitparser.dart';
import 'package:petitparser/debug.dart';

main() {
  showIt(p, s, [tag = '']) {
    var result = p.parse(s);
    print('''($tag): $result ${result.message}, ${result.position} in:
$s
123456789123456789
''');
  }
  final id = letter() & word().star();
  final f1 = id & char('(') & letter().star() & char(')');
  final f2 = id & char('(') & any().star() & char(')');
  showIt(f1, 'foo(a)', 'as expected');
  showIt(f2, 'foo(a)', 'why any not matching a?');
  final re1 = new RegExp(r'[a-zA-Z]\w*\(\w*\)');
  final re2 = new RegExp(r'[a-zA-Z]\w*\(.*\)');
  print('foo(a)'.contains(re1));
  print('foo(a)'.contains(re2));
}

The output:

(as expected): Success[1:7]: [f, [o, o], (, [a], )] null, 6 in:
foo(a)
123456789123456789

(why any not matching a?): Failure[1:7]: ")" expected ")" expected, 6 in:
foo(a)
123456789123456789

true
true

I'm pretty sure the reason has to do with the fact that any matches the closing paren. But when it then looks for the closing paren and can't find it, shouldn't it:

Also, for contrast, I show the analogous regexps, which do match.


Solution

  • As you analyzed correctly, the any parser in your example consumes the closing parenthesis, because the star parser wrapping it eagerly consumes as much input as possible.

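    Backtracking as you describe is not done automatically by PEGs (parsing expression grammars); only the ordered choice backtracks automatically. Once any().star() has consumed the closing parenthesis, the following char(')') fails at the end of the input, and so the whole sequence fails.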

    There are multiple ways to fix your example. The most straightforward one is to keep any from matching the closing parenthesis:

    id & char('(') & char(')').neg().star() & char(')')
    

    or

    id & char('(') & pattern('^)').star() & char(')')
    

    Alternatively, you can use the starLazy operator, which is implemented using the star and ordered choice operators. An explanation can be found here.

    id & char('(') & any().starLazy(char(')')) & char(')')
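
    To make the difference concrete, here is a minimal, dependency-free sketch of the PEG semantics involved. It is not PetitParser's code: the Parser typedef and the helpers char, any, neg, and, seq, choice, star and starLazy are hypothetical stand-ins written for this illustration, and the starLazy shown is just one plausible way to build a lazy repetition out of lookahead and ordered choice. For brevity it parses only the parenthesized part, '(a)'.

```dart
// Dependency-free sketch of PEG semantics; this is NOT PetitParser's code.
// A parser maps (input, position) to the end position, or null on failure.
typedef Parser = int? Function(String input, int pos);

// Matches the single character [c].
Parser char(String c) =>
    (input, pos) => pos < input.length && input[pos] == c ? pos + 1 : null;

// Matches any one character.
Parser any() => (input, pos) => pos < input.length ? pos + 1 : null;

// Matches one character, but only where [p] does not match.
Parser neg(Parser p) => (input, pos) =>
    pos < input.length && p(input, pos) == null ? pos + 1 : null;

// Lookahead: succeeds where [p] matches, without consuming input.
Parser and(Parser p) => (input, pos) => p(input, pos) == null ? null : pos;

// Sequence: each part starts where the previous one ended. If a later
// part fails, earlier parts are never retried with a shorter match.
Parser seq(List<Parser> parts) => (input, pos) {
      var current = pos;
      for (final part in parts) {
        final next = part(input, current);
        if (next == null) return null;
        current = next;
      }
      return current;
    };

// Ordered choice: try [a]; only if it fails, try [b] at the same position.
Parser choice(Parser a, Parser b) =>
    (input, pos) => a(input, pos) ?? b(input, pos);

// Greedy star: repeat [p] as often as possible; always succeeds.
Parser star(Parser p) => (input, pos) {
      var current = pos;
      while (true) {
        final next = p(input, current);
        if (next == null) return current;
        current = next;
      }
    };

// Lazy star (assumed construction): stop as soon as [limit] would match,
// otherwise consume one [p] and try again.
Parser starLazy(Parser p, Parser limit) {
  late final Parser self;
  self = choice(and(limit), seq([p, (input, pos) => self(input, pos)]));
  return self;
}

final open = char('('), close = char(')');
final greedy = seq([open, star(any()), close]);
final negated = seq([open, star(neg(close)), close]);
final lazy = seq([open, starLazy(any(), close), close]);

void main() {
  print(greedy('(a)', 0)); // null: star(any()) also ate ')'
  print(negated('(a)', 0)); // 3
  print(lazy('(a)', 0)); // 3
}
```

    The greedy variant fails for the same reason f2 does: the sequence never asks star to give a character back. The other two variants succeed because the repetition itself is prevented from consuming the closing parenthesis.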