I'm working on the clike language compiler using Jison package. I went really well until I've introduced classes, thus Type
can be a LITERAL
now. Here is a simplified grammar:
%lex
%%
\s+ /* skip whitespace */
int return 'INTEGER'
string return 'STRING'
boolean return 'BOOLEAN'
void return 'VOID'
[0-9]+ return 'NUMBER'
[a-zA-Z_][0-9a-zA-Z_]* return 'LITERAL'
"--" return 'DECR'
<<EOF>> return 'EOF'
"=" return '='
";" return ';'
/lex
%%
Program
: EOF
| Stmt EOF
;
Stmt
: Type Ident ';'
| Ident '=' NUMBER ';'
;
Type
: INTEGER
| STRING
| BOOLEAN
| LITERAL
| VOID
;
Ident
: LITERAL
;
And the jison conflict:
Conflict in grammar: multiple actions possible when lookahead token is LITERAL in state 10
- reduce by rule: Ident -> LITERAL
- reduce by rule: Type -> LITERAL
Conflict in grammar: multiple actions possible when lookahead token is = in state 10
- reduce by rule: Ident -> LITERAL
- reduce by rule: Type -> LITERAL
States with conflicts:
State 10
Type -> LITERAL . #lookaheads= LITERAL =
Ident -> LITERAL . #lookaheads= LITERAL =
I've found quite a similar question that has no been answered, does any one have any clue how to solve this?
That's evidently a bug in jison, since the grammar is certainly LALR(1), and is handled without problems by bison. Apparently, jison is incorrectly computing the lookahead for the state in which the conflict occurs. (Update: It seems to be bug 205, reported in January 2014.)
If you ask jison to produce an LR(1) parser instead of an LALR(1) grammar, then it correctly computes the lookaheads and the grammar passes without warnings. However, I don't think that is a sustainable solution.
Here's another work-around. The Decl
and Assign
productions are not necessary; the "fix" was to remove LITERAL
from Type
and add a separate production for it.
Program
: EOF
| Stmt EOF
;
Decl
: Type Ident ';'
| LITERAL Ident ';'
;
Assign
: Ident '=' NUMBER ';'
;
Stmt
: Decl
| Assign
;
Type
: INTEGER
| STRING
| BOOLEAN
| VOID
;
Ident
: LITERAL
;
You might want to consider recognizing more than one statement:
Program
: EOF
| Stmts EOF
;
Stmts
: Stmt
| Stmts Stmt
;