dart petitparser

Why is the ordered choice of toChoiceParser() ignored when adding a plus() parser?


I am stuck at one point with the Dart package petitparser: It seems that the "priority rule" ("parse p1, if that doesn't work parse p2 - ordered choice") is ignored by the toChoiceParser() if a plus() parser is added.

import 'package:petitparser/petitparser.dart';

// This parser should check, from left to right, whether a nestedTerm such as '(0)' or '(()' exists.
// If not, it should look for a singleCharacter: '(', ')' or '0' (lower priority).
// In case 1 everything works as expected. But if the choice is repeated one or more times, as in case 2,
// the parser no longer seems to recognize the nestedTerm, even though its higher priority should
// produce the same terminal output as in case 1. Where is my mistake?

void main() {
  final definition = ExpressionDefinition();
  final parser = definition.build();
  print(parser.parse('(0)').toString());
  // Terminal output in case 1: ['(' (nestedTerm), '0' (singleCharacter), ')' (nestedTerm)]
  // Terminal output in case 2: ['(' (singleCharacter), '0' (singleCharacter), ')' (singleCharacter)]
}

class ExpressionDefinition extends GrammarDefinition {
  @override
  Parser start() => ref0(term).end();
  // Case 1 (parses only once):
  Parser term() => ref0(nestedTerm) | ref0(singleCharacter);
  // Case 2 (parses one or more times):
  // Parser term() => (ref0(nestedTerm) | ref0(singleCharacter)).plus();
  Parser nestedTerm() =>
      char('(').map((value) => "'$value' (nestedTerm)") &
      ref0(term) &
      char(')').map((value) => "'$value' (nestedTerm)");
  Parser singleCharacter() =>
      char('(').map((value) => "'$value' (singleCharacter)") |
      char(')').map((value) => "'$value' (singleCharacter)") |
      char('0').map((value) => "'$value' (singleCharacter)");
}

However, for my current project, the "priority rule" should also work in this case (in this example case 2).

Can anyone spot my mistake? Thanks a lot for your support!


Solution

  • Probably the easiest way to understand what is going on is to compare the parse traces of the two parsers (see also the section on debugging grammars that I recently added):

    import 'package:petitparser/debug.dart';
    
    void main() {
      ...
      trace(parser).parse('(0)');
    }

    You will see that in case 2 the nestedTerm is correctly started, but then, inside the nestedTerm, the plus() parser eagerly consumes the remaining input characters 0 and ). This causes the outer nestedTerm to fail, because no ) is left to complete it. The ordered choice is still applied: only after nestedTerm has failed does it fall back to singleCharacter, and plus() then consumes the complete input one single character at a time.

    From the examples given, it is not entirely clear what result you expect. Removing char(')') from the singleCharacter parser would solve the issue described.
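
To make the suggestion of dropping char(')') from singleCharacter concrete, here is a minimal sketch of case 2 with that one change. The class name FixedExpressionDefinition is made up for this illustration; everything else is taken from the grammar in the question. It assumes that a stray ')' outside of a nestedTerm does not need to be accepted, which may or may not fit the actual project.

```dart
import 'package:petitparser/petitparser.dart';

// Hypothetical name; this is the grammar from the question with one change.
class FixedExpressionDefinition extends GrammarDefinition {
  @override
  Parser start() => ref0(term).end();

  // Case 2 from the question: the ordered choice, repeated one or more times.
  Parser term() => (ref0(nestedTerm) | ref0(singleCharacter)).plus();

  Parser nestedTerm() =>
      char('(').map((value) => "'$value' (nestedTerm)") &
      ref0(term) &
      char(')').map((value) => "'$value' (nestedTerm)");

  // ')' is no longer a singleCharacter, so the inner plus() stops in front
  // of it and leaves it for the enclosing nestedTerm to consume.
  Parser singleCharacter() =>
      char('(').map((value) => "'$value' (singleCharacter)") |
      char('0').map((value) => "'$value' (singleCharacter)");
}

void main() {
  final parser = FixedExpressionDefinition().build();
  // Both parentheses should now carry the nestedTerm label.
  print(parser.parse('(0)'));
}
```

Because plus() wraps each level, the parsed value is nested more deeply than the flat list shown for case 1. The trade-off is that an input with an unmatched ')' such as '0)' is now rejected instead of being consumed as single characters.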