I'm getting the following error:
Warning : *** Shift/Reduce conflict found in state #2
between ExitLoopStatement ::= EXITLOOP (*)
and VarAccess ::= (*) DOLLAR IDENTIFIER
under symbol DOLLAR
This is the grammar I use. Currently I have assignments and break (each break statement can specify the number of loop levels to exit from).
start with StatementList;
StatementList
::= Statement
|
StatementList Statement
;
Statement
::= AssignmentStatement
|
ExitLoopStatement
;
AssignmentStatement
::= VarAccess EQ_OP VarAccess
;
VarAccess
::= DOLLAR IDENTIFIER
;
ExitLoopStatement
::= EXITLOOP
|
EXITLOOP VarAccess
;
From what I understand, the problem is that after I've encountered EXITLOOP, the next token can be the start of an assignment or the number of levels to exit from.
Is there any way to solve this problem?
Thanks!
In effect, if your language allows a statement to optionally end with something which might start another statement and the language does not require explicit punctuation between statements, then you will not be able to parse it with a limited lookahead parser.
In this extremely simple case, you could parse your language with an LR(3) parser, which means that an LR(1) grammar exists (although it will not be very readable). However, as you add more syntax to the language, you will probably find that the required lookahead starts to increase, and even becomes unbounded. (This will occur, for example, if you implement arrays and allow subscripts to be arbitrary expressions.)
As long as the language is not actually ambiguous, you could still parse it with a GLR parser. Unfortunately, CUP does not provide this option, and I am not sure that bison's Java framework allows it either.
On the whole, the simplest solution is to insist on statement separators/terminators (like ; in Java) or remove optional trailing syntaxes which could be confused with the start of statements. For example, if you change exit $x
to exit ($x)
and don't allow statements to begin with ( then you will not have a parsing conflict.