I am trying to write parsing in java cup but I get conflict
start with prog;
prog ::= header;
header ::= t1 t3 t2 | t3 t1 t2 | t2 t3 t1 | t1 t2 t3 | t2 t1 t3 | t3 t2 t1;
t1 ::= TOK1;
t2 ::= TOK2 | TOK2 TOK2 ;
t3 ::= | t3 TOK3;
I got this error;
Warning : *** Shift/Reduce conflict found in state #3
between t3 ::= (*)
and t1 ::= (*) TOK1
under symbol TOK1
Resolved in favor of shifting.
Can someone explain where I did mistake?
t3
can be empty (it matches a sequence of zero or more TOK3
tokens). That creates an ambiguity, because an input with no TOK3
s can match more than one of the alternatives for header
.
This is ambiguous despite the fact that all the possibilities reduce to the same non-terminal, because each production has a different reduction action, and the grammar has no way to know which action to perform. As noted in the conflict report, the parser will shift rather than reduce. For example, if the first input is TOK1
, the parser could assume the input will match t1 t2 t3
or t1 t3 t2
, in which case the sequence of TOK3
s will come later, or it could assume the input is t3 t1 t2
with an empty t3
, in which it must do the action for an empty t3
before proceeding with the parse. Choosing to shift means rejecting the possibility of an empty t3
at the beginning of the input.
In this particular case, the default shift is probably not going to cause problems. It means that an input consisting only of TOK1
and TOK2
will always be parsed with an empty t3
at the end, but if that's OK, then you can feel free to ignore the warning.
So-called "nullable non-terminals" are often the cause of shift-reduce conflicts precisely of this form. It's often useful to avoid them; instead of making possibly empty lists, a better solution is often to make the list non-terminal match one or more elements, and then make it optional by adding productions in which the list is absent. In other words, instead of
header: t3 t1 t2
t3 :
| t3 TOK3
use
header: t3 t1 t2
| t1 t2
t3 : TOK3
| t3 TOK3
That solution can lead to undesirable code duplication, but it's an easy way to solve shift-reduce conflicts due to nullable non-terminals.