antlr4 lack of whitespace affecting parse with -> skip


The below antlr4 grammar works as expected when parsing

"3 + 2 * 5"

but fails to parse

"3 + 2*5"

reporting "1:5 extraneous input '*5' expecting {<EOF>, WS}". I'm skipping whitespace with "-> skip", so I don't understand why the behavior is different. What am I missing?

I'm using antlr 4.7.2, but the same behavior can be seen on lab.antlr.org.

grammar Expr;

top : expr EOF
    ;

expr
    : '(' expr ')'     # ParenExpr
    | left=expr op=('*' | '/') right=expr    # BinExpr
    | left=expr op=('+' | '-') right=expr    # BinExpr
    | atom     # Atom
    | f=expr '(' arg=expr ')'   # FunCallExpr
    ;

atom
    : IDENTIFIER | NUMBER | HEXINT
    ;

HEXINT: '0' 'x' [0-9a-fA-F]+
    ;

IDENTIFIER: [A-Za-z_][a-zA-Z0-9_]*
    ;


NUMBER
    : [0-9]*.[0-9]+ ('e' [-+]? [0-9]+)?
    | [0-9]+
    ;

WHITESPACE: [ \r\n\t]+ -> skip;

Solution

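  • That is because *5 is tokenized as a NUMBER. An unquoted . in a lexer rule is a wildcard that matches any single character, and [0-9]* can match nothing, so [0-9]*.[0-9]+ matches the * followed by 5. The rule should use '.' [0-9]+ rather than .[0-9]+. The spaced input only parses by accident: there the wildcard swallows the space in front of 2 and 5 (" 2" and " 5" each become a NUMBER), and * is followed by a space rather than a digit, so it is still lexed as an operator.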
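
To see the effect outside ANTLR, here is a rough sketch in Python. It is not ANTLR's lexer, only an approximation of its "longest match wins, ties go to the earlier rule" behavior built on the standard re module. The names make_rules, tokenize, BUGGY and FIXED are made up for this illustration, and the regexes are hand translations of the lexer rules in the question.

```python
import re

EXPONENT = r"(?:e[-+]?[0-9]+)?"

def make_rules(number_alt1):
    """Build (name, compiled regex) pairs in grammar order.

    Assumption: implicit literal tokens such as '*' are tried first,
    as ANTLR defines them ahead of the named lexer rules.
    """
    specs = [
        ("OP", r"[()*/+\-]"),
        ("HEXINT", r"0x[0-9a-fA-F]+"),
        ("IDENTIFIER", r"[A-Za-z_][a-zA-Z0-9_]*"),
        ("NUMBER", number_alt1),      # first alternative of NUMBER
        ("NUMBER", r"[0-9]+"),        # second alternative of NUMBER
        ("WHITESPACE", r"[ \r\n\t]+"),
    ]
    # DOTALL so that '.' also matches newlines, like ANTLR's wildcard.
    return [(name, re.compile(pat, re.DOTALL)) for name, pat in specs]

# Unescaped '.' is a wildcard in regex too, mirroring the grammar's bug.
BUGGY = make_rules(r"[0-9]*.[0-9]+" + EXPONENT)
# Escaped '\.' plays the role of the quoted '.' literal in the fixed grammar.
FIXED = make_rules(r"[0-9]*\.[0-9]+" + EXPONENT)

def tokenize(text, rules):
    """Approximate maximal-munch lexing; WHITESPACE tokens are skipped."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = pattern.match(text, pos)
            # Strict '>' keeps the earlier rule when two matches tie.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at offset {pos}")
        if best_name != "WHITESPACE":
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

if __name__ == "__main__":
    for label, rules in (("buggy", BUGGY), ("fixed", FIXED)):
        for source in ("3 + 2 * 5", "3 + 2*5"):
            print(label, repr(source), [t for _, t in tokenize(source, rules)])
```

With the buggy rule, "3 + 2*5" comes out as '3', '+', ' 2', '*5', which lines up with the "extraneous input '*5'" at column 5. With the escaped dot it comes out as '3', '+', '2', '*', '5'.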