I'm trying to use ANTLR 4 to create a simple grammar for some SELECT statements in Oracle DB, and I've run into a small problem. I have the following grammar:
Grammar & Lexer
column
: (tableAlias '.')? IDENT ((AS)? colAlias)?
| expression ((AS)? colAlias)?
| caseWhenClause ((AS)? colAlias)?
| rankAggregate ((AS)? colAlias)?
| rankAnalytic colAlias
;
colAlias
: '"' IDENT '"'
| IDENT
;
rankAnalytic
: RANK '(' ')' OVER '(' queryPartitionClause orderByClause ')'
;
RANK: R A N K;
fragment A:('a'|'A');
fragment N:('n'|'N');
fragment R:('r'|'R');
fragment K:('k'|'K');
The important part is the rankAnalytic alternative in the column rule. I declared that a colAlias must follow the rank expression, but when that alias is named rank (without quotes), it is recognized as the RANK lexer token rather than as a colAlias.
For example, given the following text:
SELECT fulfillment_bundle_id, SKU, SKU_ACTIVE, PARENT_SKU, SKU_NAME, LAST_MODIFIED_DATE,
RANK() over (PARTITION BY fulfillment_bundle_id, SKU, PARENT_SKU
order by ACTIVE DESC NULLS LAST,SKU_NAME) rank
the "rank" alias is underlined and flagged as a mistake with the following error:
mismatched input 'rank' expecting {'"', IDENT}
The point is that here I don't want rank to be recognized as the RANK lexer token, only as an alias for the column.
Open for your suggestions :)
The RANK rule apparently appears above the IDENT rule, so the string "rank" will never be emitted by the lexer as an IDENT token.
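To make the ordering concrete, here is a minimal sketch of the lexer rules as they presumably appear. The IDENT definition is an assumption, since the question does not show it. When two lexer rules match the same text with the same length, ANTLR picks the rule that is defined first.

```antlr
// Assumed rule order; IDENT's definition is a guess, it is not in the question.
RANK  : R A N K ;                  // defined first, so it wins the tie for "rank"
IDENT : [a-zA-Z_] [a-zA-Z0-9_]* ;  // also matches "rank", but is defined later
```

With these rules, the input rank becomes a RANK token, while ranking becomes an IDENT, because the longer match takes priority over rule order.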
A simple fix is to change the colAlias rule:
colAlias
: '"' ( IDENT | RANK ) '"'
| ( IDENT | RANK )
;
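The same fix can also be factored into a helper rule, so that every keyword allowed as a name is listed in one place. This is only a sketch; the rule name identifier is hypothetical and not part of the original grammar.

```antlr
colAlias
    : '"' identifier '"'
    | identifier
    ;

// Hypothetical helper rule: each keyword token that may double as a name is added here once.
identifier
    : IDENT
    | RANK
    ;
```

Other rules that accept a name can then refer to identifier instead of repeating the list of alternatives.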
OP added:
OK, but what if I have not just RANK but a whole list (>100) of such keywords as lexer rules? What am I supposed to do?
If colAlias can be literally anything, then let it:
colAlias
: '"' .+? '"' // must quote if multiple
| . // one token
;
If that definition would incur ambiguities, a predicate is needed to qualify the match:
colAlias
: '"' m+=.+? '"' { check($m) }? // multiple
| o=. { check($o) }? // one
;
Functionally, the predicate is just another element in the subrule.
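The check helper is not defined in the answer, so what it tests is up to the grammar author. As a rough sketch, assuming the Java target, it could be supplied through a @parser::members block placed before the parser rules. The deny-list entries below are placeholders, and both overloads are hypothetical.

```antlr
@parser::members {
    // Hypothetical deny-list: words that should never be accepted as an alias.
    // These entries are placeholders, not taken from the original grammar.
    private static final java.util.Set<String> NOT_AN_ALIAS =
        new java.util.HashSet<String>(
            java.util.Arrays.asList("from", "where", "group", "order"));

    // One-token form, used by the `o=.` alternative.
    boolean check(Token t) {
        return t != null && !NOT_AN_ALIAS.contains(t.getText().toLowerCase());
    }

    // List form, used by the quoted `m+=.+?` alternative:
    // in this sketch any non-empty quoted content is accepted.
    boolean check(java.util.List<Token> ts) {
        return ts != null && !ts.isEmpty();
    }
}
```

Because each predicate follows the element it inspects, the labeled token has already been matched by the time check runs, and a false result makes the parser report a failed predicate for that alternative.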