I am trying to convert an EBNF file into working BNF for IntelliJ Grammar-kit.
In the EBNF there are rules as such:
BinOpChar ::= "~" | "!" | "@" | "#" | "$" | "%" | "^" | "&" | "*" | "-"
BinOp ::= BinOpChar, {BinOpChar}
How can I create such rules without resorting to regex? This kind of construct occurs very often, and writing a regex for each one becomes repetitive.
To be clear, I would like to be able to create a rule from BinOpChar that matches @@ but does not match @ @. Is that possible?
The easiest way is to list each operator independently:
{
tokens=[
//...
op_1='~'
op_2='!'
op_3='@'
op_4='@@'
op_5='#'
//...
]
}
If you really want to accept every one- and two-character operator (n + n^2 tokens for n operator characters), you will need to use a regular expression:
{
tokens=[
//...
bin_op='regexp:[~!@#]{1,2}'
//...
]
}
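To make the n + n^2 count concrete, here is a small Python sketch (not Grammar-Kit code) that enumerates every one- and two-character operator over the four example characters and checks each against the same pattern. The names `OP_CHARS`, `BIN_OP`, and `all_operator_tokens` are made up for illustration, and Python's `re` module stands in for the lexer's regex engine, which is an assumption.

```python
import itertools
import re

# The four operator characters used in the example regex above.
OP_CHARS = "~!@#"

# Python equivalent of the token pattern [~!@#]{1,2} (illustrative stand-in).
BIN_OP = re.compile(r"[~!@#]{1,2}")


def all_operator_tokens(chars, max_len=2):
    """Enumerate every string of 1..max_len characters drawn from chars."""
    tokens = []
    for length in range(1, max_len + 1):
        for combo in itertools.product(chars, repeat=length):
            tokens.append("".join(combo))
    return tokens


tokens = all_operator_tokens(OP_CHARS)
n = len(OP_CHARS)

# Compare the number of enumerated tokens with n + n^2.
print(len(tokens), n + n**2)

# Every enumerated operator should match the pattern in full.
print(all(BIN_OP.fullmatch(t) for t in tokens))

# An operator pair separated by a space is not a single token.
print(BIN_OP.fullmatch("@ @"))
```

With all ten characters from the question the same count would be 10 + 100 = 110 separate tokens, which is why a single regexp token is less work than listing them one by one.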
The general idea, though, is to use the lexer to define tokens and the grammar to define expressions and larger constructs. So if you instead write this in the grammar:
{
tokens=[
space='regexp:\s+'
]
}
BinOp ::= BinOpChar [BinOpChar]
BinOpChar ::= "~" | "!" | "@" | "#"
Then it's going to accept both @@ and @ @, since whitespace between the two tokens is skipped before the parser sees them. Does that make sense?
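If it helps, the difference between the two approaches can be sketched outside Grammar-Kit. The Python below is only an illustration of the lexer/parser split, not how Grammar-Kit is implemented: `lex`, `TOKEN_RE`, and `CHAR_RE` are hypothetical names, and dropping whitespace inside the lexer is an assumption that mimics a skipped `space` token.

```python
import re

# Lexer-level operator token: a run of one or two operator characters.
TOKEN_RE = re.compile(r"(?P<bin_op>[~!@#]{1,2})|(?P<space>\s+)")

# Single-character tokens only, mimicking a grammar built from BinOpChar literals.
CHAR_RE = re.compile(r"(?P<bin_op_char>[~!@#])|(?P<space>\s+)")


def lex(text, pattern=TOKEN_RE):
    """Split text into (token_name, lexeme) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = pattern.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        # Whitespace is consumed but never handed to the parser.
        if match.lastgroup != "space":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


# With a lexer-level token, the two inputs produce different token streams.
print(lex("@@"))
print(lex("@ @"))

# With single-character tokens, the parser receives the same stream for both.
print(lex("@@", CHAR_RE))
print(lex("@ @", CHAR_RE))
```

In the second case the grammar rule has nothing left to distinguish the two inputs, because the space is already gone by the time the tokens arrive. That is why the distinction has to be made in the token definition.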