
ANTLR 4 token rule that matches any characters until it encounters XYZ


I want a token rule that gobbles up all characters until it gets to the characters XYZ.

Thus, if the input is this:

helloXYZ

then the token rule should return this token:

hello

If the input is this:

Blah Blah XYZ

then the token rule should return this token:

Blah Blah

How do I define a token rule to do this?


Solution

  • Using the hint that Terence gives in his answer, I think this is what Roger is looking for:

    grammar UseLookahead;
    
    parserRule : LexerRule;
    
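    // Match one or more characters non-greedily; the predicate lets the match
    // end only when the next three characters are X, Y, Z (Java target).
    LexerRule : .+? { (_input.LA(1) == 'X') &&
                      (_input.LA(2) == 'Y') &&
                      (_input.LA(3) == 'Z')
                    }?
              ;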
    

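    This gives the required tokens, hello and Blah Blah respectively. As for the final ?: in ANTLR 4, braces followed by ? form a semantic predicate, so the ? tells ANTLR to treat the code in the braces as a boolean condition that must be true for the match to succeed, rather than as an action to execute.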
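
To make the lookahead logic concrete, here is a rough plain-Java emulation of what the rule does. It is only a sketch and not generated ANTLR code: the class name UntilXyzSketch and the method matchUntilXyz are hypothetical, and it assumes the token starts at the beginning of the input. It consumes at least one character (the .+? part) and stops at the first position where the next three characters are X, Y, Z (the predicate).

```java
// Hypothetical helper that mimics the lexer rule; not part of ANTLR.
final class UntilXyzSketch {

    /**
     * Returns the shortest non-empty prefix of input that is immediately
     * followed by "XYZ", or null if no such prefix exists.
     */
    static String matchUntilXyz(String input) {
        // .+? must consume at least one character, so start at end = 1.
        for (int end = 1; end + 3 <= input.length(); end++) {
            // Same check as the predicate:
            // LA(1) == 'X' && LA(2) == 'Y' && LA(3) == 'Z'
            if (input.startsWith("XYZ", end)) {
                return input.substring(0, end);
            }
        }
        return null; // no "XYZ" preceded by at least one character
    }
}
```

One detail this makes visible: for the input Blah Blah XYZ, the sketch returns the text with the space before XYZ still attached, because nothing in the logic strips it. If that space is unwanted, it would need to be trimmed or excluded separately.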