javagrammarpetitparser

How to get the associativity correct when parsing non-symmetric binary operators with PetitParser?


I'm trying to make a basic math parser with PetitParser, and I fail to get the order correct with non-symmetric binary operator like subtraction or division.

I have this small example that can only parse (non-negative) integers and the - binary operator, and emits a string with the same parsed expression with parenthesis (so that I can see the associativity):

import java.util.List;

import org.petitparser.parser.Parser;
import org.petitparser.parser.combinators.SettableParser;

import static org.petitparser.parser.primitive.CharacterParser.*;

public class App {
    public static void main(String[] args) {
        Parser number = digit().plus().flatten().trim();

        SettableParser term = SettableParser.undefined();
        term.set(number.seq(of('-').flatten().trim()).seq(term).map((List<String> values) -> {
            return String.format("(%s - %s)", values.get(0), values.get(2));
        }).or(number));

        Parser expression = term.end();

        System.out.println(expression.parse("1 - 2 - 3").<String>get());
    }
}

This prints (1 - (2 - 3)) - though the correct associativity for 1 - 2 - 3 is ((1 - 2) - 3).

Now, I understand that my grammer is this:

number: [0-9]+
term: number '-' term
expression: number $

So ((1 - 2) - 3) is term '-' number. But when I try to switch them:

        term.set(term.seq(of('-').flatten().trim()).seq(number).map((List<String> values) -> {
            return String.format("(%s - %s)", values.get(0), values.get(2));
        }).or(number));

I run into an endless recursion:

:runException in thread "main" java.lang.StackOverflowError
        at org.petitparser.parser.combinators.SequenceParser.parseOn(SequenceParser.java:22)
        at org.petitparser.parser.actions.ActionParser.parseOn(ActionParser.java:29)
        at org.petitparser.parser.combinators.ChoiceParser.parseOn(ChoiceParser.java:22)
        at org.petitparser.parser.combinators.DelegateParser.parseOn(DelegateParser.java:24)
        at org.petitparser.parser.combinators.SequenceParser.parseOn(SequenceParser.java:25)
        at org.petitparser.parser.actions.ActionParser.parseOn(ActionParser.java:29)
        at org.petitparser.parser.combinators.ChoiceParser.parseOn(ChoiceParser.java:22)
        at org.petitparser.parser.combinators.DelegateParser.parseOn(DelegateParser.java:24)
        at org.petitparser.parser.combinators.SequenceParser.parseOn(SequenceParser.java:25)
        at org.petitparser.parser.actions.ActionParser.parseOn(ActionParser.java:29)
        at org.petitparser.parser.combinators.ChoiceParser.parseOn(ChoiceParser.java:22)
        at org.petitparser.parser.combinators.DelegateParser.parseOn(DelegateParser.java:24)
        at org.petitparser.parser.combinators.SequenceParser.parseOn(SequenceParser.java:25)
        at org.petitparser.parser.actions.ActionParser.parseOn(ActionParser.java:29)
        at org.petitparser.parser.combinators.ChoiceParser.parseOn(ChoiceParser.java:22)
        at org.petitparser.parser.combinators.DelegateParser.parseOn(DelegateParser.java:24)
        at org.petitparser.parser.combinators.SequenceParser.parseOn(SequenceParser.java:25)
        at org.petitparser.parser.actions.ActionParser.parseOn(ActionParser.java:29)
        at org.petitparser.parser.combinators.ChoiceParser.parseOn(ChoiceParser.java:22)
        at org.petitparser.parser.combinators.DelegateParser.parseOn(DelegateParser.java:24)
        at org.petitparser.parser.combinators.SequenceParser.parseOn(SequenceParser.java:25)
        at org.petitparser.parser.actions.ActionParser.parseOn(ActionParser.java:29)
        ........

So... how can I parse the expression the way it should be parsed?

UPDATE

As per @rici's suggestion, I've changed it to use ExpressionBuilder:

import java.util.List;

import org.petitparser.parser.Parser;
import org.petitparser.parser.combinators.SettableParser;
import org.petitparser.tools.ExpressionBuilder;

import static org.petitparser.parser.primitive.CharacterParser.*;

public class App {
    public static void main(String[] args) {
        Parser number = digit().plus().flatten().trim();

        SettableParser term = SettableParser.undefined();

        ExpressionBuilder builder = new ExpressionBuilder();
        builder.group().primitive(number);
        builder.group().left(of('-').trim(), (List<String> values) -> {
            return String.format("(%s - %s)", values.get(0), values.get(2));
        });

        term.set(builder.build());
        Parser expression = term.end();
        System.out.println(expression.parse("1 - 2 - 3"));
    }
}

By using left() or right() I can choose the associativity of the binary operators.


Solution

  • Top-down parsers cannot handle left-recursion, and you cannot write a BNF grammar for a left-associative expression grammar without left-recursion. So what to do? (Other than switch to a bottom-up parsing methodology.)

    One simple possibility, if the parsing framework supports it, is to use repetition to parse a sequence of similar operators, using a grammar which would look something like:

    term: factor ( ('-' | '+') factor)*
    factor: number ( ( '*' | '/') number)*
    

    Then you can apply whatever associativity you like to the list which is produced by the parse.

    That's a degenerate case of the more general solution of writing a simple shunting-yard processor (simple because it need not deal with parentheses). You might want this solution if you want to be able to define new operators (with precedence and associativity) at run-time.

    With PetitParser, the easiest solution is probably to use the included ExpressionBuilder. See https://github.com/petitparser/java-petitparser/blob/master/petitparser-core/src/main/java/org/petitparser/tools/ExpressionBuilder.java