antlrantlr4

Clarifying a parsing ambiguity in Java-like expressions with ANTLR 4


I am writing a parser for a subset of Java-like expressions, and have found an ambiguity when parsing this code:

(something) -x

since it could be parsed as something - x (binary expression) or (something) (-x) (casted unary expression). Currently the parser favours the first option but I would like the second one. The relevant parts of my parser look like this:

expression
   : LeftParen expr = expression RightParen # ParenthesizedExpression
   | id = Identifier # IdentifierExpression
   | op = (Minus | BitwiseNot) expr = expression # UnaryExpression
   | < assoc = right > LeftParen type = Identifier RightParen expr = expression # CastExpression
   | left = expression op = (Plus | Minus) right = expression # AdditiveExpression
   ;

Any ideas how to resolve this, ideally without eliminating the direct left-recursion? Thanks.


Solution

  • The alternatives in an ANTLR parser rule are tried from top to bottom. If you move your cast expression up, it will parse as expected:

    expression
     : <assoc=right> LeftParen type=Identifier RightParen expr=expression   # CastExpression
     | LeftParen expr=expression RightParen                                 # ParenthesizedExpression
     | op=(Minus | BitwiseNot) expr=expression                              # UnaryExpression
     | left=expression op=(Plus | Minus) right=expression                   # AdditiveExpression
     | id=Identifier                                                        # IdentifierExpression
     ;
    

    enter image description here