parsing antlr4 grun

antlr4 matching of choice expression


I'm writing a parser for Chrome DEPS files. How do I get the parser to match any one of the alternatives in the definition of the rightexpr rule? My grammar looks like this:

grammar Depsgrammar;

prog:   expr+ EOF;

expr:   varline
   ;
   
varline:  
   ID EQ  rightexpr  
    ;

rightexpr :  
    basicvalue | bentukonejsonval| bentuktwojsonval  
   ;
   
bentukonejsonval :
   '[' string?  (COMMA string )* COMMA? ']'
    ;
   
bentuktwojsonval :
    '{' singledictexpr?  (COMMA singledictexpr )* COMMA? '}'
   ;
   
singledictexpr :
    string ':' basicvalue
   ;
   
basicvalue :
 True
| False
| string
| NUM
| varfunc
;


varfunc :
 Var '(' string ')'
 ;

string :
 SIMPLESTRINGEXPRDOUBLEQUOTE
| SIMPLESTRINGEXPRSINGLEQUOTE
;

Var : 'Var' ;
COMMA : ',' ;
NUM : [0-9]+;
ID : [a-zA-Z0-9_]+;
True : [tT] [Rr] [Uu] [Ee];
False: [Ff] [Aa] [Ll] [Ss] [Ee];

fragment SIMPLESTRINGEXPRDOUBLEQUOTEBASE :   ~ ( '\n' | '\r' | '"' )*  ;
SIMPLESTRINGEXPRDOUBLEQUOTE: '"' SIMPLESTRINGEXPRDOUBLEQUOTEBASE '"' ;
fragment SIMPLESTRINGEXPRSINGLEQUOTEBASE :   ~ ( '\n' | '\r' | '\'' )* ;
SIMPLESTRINGEXPRSINGLEQUOTE : '\'' SIMPLESTRINGEXPRSINGLEQUOTEBASE '\'' ;
EQ : '=';
COMMENT:
  '#' ~ ( '\n' | '\r' )* '\n' -> skip ;

WS : [ \n\t\r]+ -> skip ;

I want the user to be able to enter input like this:

#adas21 #FS;SFD33
_as= Var('das') # somelongth comment
_as_0= FALSE # somelongth comment
_as_0= 'as!' # somelongth comment
gclient_gn_args = [

#ad as!~;
'checkout_libaom',
'checkout_nacl',
'"{cros_board}" == "amd64-generic"',
'checkout_oculus_sdk',
 ]


 vars = {
 'checkout_libaom':1,
 'checkout_nacl': "SS",
 'checkout_oculus_sdk': FalSe,
 'checkout_oculus_sdk':'',
  }

 s=[
 ]

Whenever I enter even a simple line like this in grun:

sa=true

I always get the error: line 1:3 mismatched input 'true' expecting ... (the tokens that can start rightexpr). I seem to be misunderstanding how ANTLR4 chooses between alternatives. Could you please explain? Thanks.


Solution

  • Whenever an error's list of expected tokens seems to include the very token that was rejected, it is a good idea to look at the tokens the lexer actually produced. You can do that by passing the -tokens option to grun. If you do this for your input, you'll see that true is lexed as an ID token, not a True token.

    The reason is that when several lexer rules match the current input and produce matches of the same length, the rule defined earliest in the grammar is chosen. ID is defined before True, and both match all four characters of true, so ID takes precedence. As a general rule, all keyword rules should be defined before the ID rule to prevent exactly this issue.

    In other words, moving the True and False rules before ID will solve your issue.
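As a minimal sketch of that reordering, the first lexer rules of the grammar in the question could be arranged as below. Only the order changes; the alignment and comments are additions for illustration, and the remaining rules (the string tokens, EQ, COMMENT, WS) are assumed to stay as they are.

```antlr
// Keyword rules come first: when two rules match the same number of
// characters, the rule defined earlier in the grammar wins.
Var   : 'Var' ;
True  : [tT] [rR] [uU] [eE] ;
False : [fF] [aA] [lL] [sS] [eE] ;

COMMA : ',' ;
NUM   : [0-9]+ ;

// ID now follows the keywords, so input such as true or FALSE
// is no longer tokenized as ID.
ID    : [a-zA-Z0-9_]+ ;
```

With this order, sa=true should lex as ID, EQ, True. A longer identifier such as trueish still becomes an ID, because the longest match is preferred before rule order is considered.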