I'm trying to write a specification file for SableCC for a version of minipython (with postfix/prefix increment and decrement operators). Some productions naturally need to use identifiers, but SableCC reports these conflicts when generating the parser:
shift/reduce conflict in state [stack: TPrint TIdentifier *] on TPlusPlus in {
[ PMultiplication = TIdentifier * ] followed by TPlusPlus (reduce),
[ PPostfix = TIdentifier * TPlusPlus ] (shift)
}
shift/reduce conflict in state [stack: TPrint TIdentifier *] on TMinusMinus in {
[ PMultiplication = TIdentifier * ] followed by TMinusMinus (reduce),
[ PPostfix = TIdentifier * TMinusMinus ] (shift)
}
shift/reduce conflict in state [stack: TPrint TIdentifier *] on TLPar in {
[ PFunctionCall = TIdentifier * TLPar PArglist TRPar ] (shift),
[ PFunctionCall = TIdentifier * TLPar TRPar ] (shift),
[ PMultiplication = TIdentifier * ] followed by TLPar (reduce)
}
shift/reduce conflict in state [stack: TPrint TIdentifier *] on TLBr in {
[ PExpression = TIdentifier * TLBr PExpression TRBr ] (shift),
[ PMultiplication = TIdentifier * ] followed by TLBr (reduce),
[ PPostfix = TIdentifier * TLBr PExpression TRBr TMinusMinus ] (shift),
[ PPostfix = TIdentifier * TLBr PExpression TRBr TPlusPlus ] (shift)
}
java.lang.RuntimeException:
I started by following a given BNF of the language and arrived at this. Here is the grammar file:
Productions
goal = {prgrm}program* ;
program = {func}function | {stmt}statement;
function = {func}def identifier l_par argument? r_par semi statement ;
argument = {arg} identifier assign_value? subsequent_arguments* ;
assign_value = {assign} eq value ;
subsequent_arguments = {more_args} comma identifier assign_value? ;
statement = {case1}tab* if comparison semi statement
| {case2}tab* while comparison semi statement
| {case3}tab* for [iterator]:identifier in [collection]:identifier semi statement
| {case4}tab* return expression
| {case5}tab* print expression more_expressions
| {simple_equals}tab* identifier eq expression
| {add_equals}tab* identifier add_eq expression
| {minus_equals}tab* identifier sub_eq expression
| {div_equals}tab* identifier div_eq expression
| {case7}tab* identifier l_br [exp1]:expression r_br eq [exp2]:expression
| {case8}tab* function_call;
comparison = {less_than} comparison less relation
| {greater_than} comparison great relation
| {rel} relation;
relation = {relational_value} relational_value
| {logic_not_equals} relation logic_neq relational_value
| {logic_equals} relation logic_equals relational_value;
relational_value = {expression_value} expression_value
| {true} true
| {false} false;
expression = {case1} arithmetic_expression
| {case2} prefix
| {case4} identifier l_br expression r_br
| {case9} l_br more_values r_br;
more_expressions = {more_exp} expression subsequent_expressions*;
subsequent_expressions = {more_exp} comma expression;
arithmetic_expression = {plus} arithmetic_expression plus multiplication
| {minus} arithmetic_expression minus multiplication
| {multiplication} multiplication ;
multiplication = {expression_value} expression_value
| {div} multiplication div expression_value
| {mult} multiplication mult expression_value;
expression_value = {exp} l_par expression r_par
| {function_call} function_call
| {value} value
| {identifier} identifier ;
prefix = {pre_increment} plus_plus prepost_operand
| {pre_decrement} minus_minus prepost_operand
| {postfix} postfix;
postfix = {post_increment} prepost_operand plus_plus
| {post_decrement} prepost_operand minus_minus;
prepost_operand = {value} identifier l_br expression r_br
| {identifier} identifier;
function_call = {args} identifier l_par arglist? r_par;
arglist = {arglist} more_expressions ;
value = {number} number
| {string} string ;
more_values = {more_values} value subsequent_values* ;
subsequent_values = comma value ;
number = {int} numeral
| {float} float_numeral ;
where identifier is of course a token, and the problematic productions in which it appears are function_call, prepost_operand, and expression_value. As an experiment, I removed prefix/postfix and prepost_operand to see whether the conflicts would at least change a little, but that just leaves the last two conflicts. Is there any way I can resolve these conflicts without changing the grammar much, or have I gone down a completely wrong path?
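The problem is the production whose right-hand side is:
print expression more_expressions
more_expressions already matches a list of expressions (so it should probably be called expression_list to be less confusing). Two consecutive expressions in a rule are obviously ambiguous: if you could have two expressions in a row, would 1+1+1 be 1+1 followed by +1, or 1 followed by +1+1? The reported conflicts are instances of the same problem. After print identifier, a following l_br could either continue the first expression (identifier l_br expression r_br) or start a second expression (l_br more_values r_br), so the parser cannot decide whether to reduce the identifier or shift. What you want is just:
print more_expressions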
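As a rough sketch of that change in the grammar file (not run through SableCC here, and assuming it accepts // comments in the specification), only the {case5} alternative of statement changes. The list productions stay exactly as in the question:

```
// inside the statement production, replacing the old {case5} alternative
| {case5} tab* print more_expressions

// unchanged: one expression followed by zero or more ", expression"
more_expressions = {more_exp} expression subsequent_expressions*;
subsequent_expressions = {more_exp} comma expression;
```

With a single expression list after print, an identifier followed by plus_plus, minus_minus, l_par, or l_br can only continue the current expression, so the four conflicts listed in the question should no longer arise from this production. Other parts of the grammar may still need attention.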