grammarbisonjison

Reduce/reduce conflict in clike grammar in jison


I'm working on the clike language compiler using Jison package. I went really well until I've introduced classes, thus Type can be a LITERAL now. Here is a simplified grammar:

%lex

%%
\s+                       /* skip whitespace */

int                       return 'INTEGER'
string                    return 'STRING'
boolean                   return 'BOOLEAN'
void                      return 'VOID'
[0-9]+                    return 'NUMBER'
[a-zA-Z_][0-9a-zA-Z_]*    return 'LITERAL'
"--"                      return 'DECR'
<<EOF>>                   return 'EOF'
"="                       return '='
";"                       return ';'


/lex

%%

Program
  : EOF
  | Stmt EOF
  ;

Stmt
  : Type Ident ';'
  | Ident '=' NUMBER ';'
  ;

Type
  : INTEGER
  | STRING
  | BOOLEAN
  | LITERAL
  | VOID
  ;

Ident
  : LITERAL
  ;

And the jison conflict:

Conflict in grammar: multiple actions possible when lookahead token is LITERAL in state 10
- reduce by rule: Ident -> LITERAL
- reduce by rule: Type -> LITERAL
Conflict in grammar: multiple actions possible when lookahead token is = in state 10
- reduce by rule: Ident -> LITERAL
- reduce by rule: Type -> LITERAL

States with conflicts:
State 10
  Type -> LITERAL . #lookaheads= LITERAL =
  Ident -> LITERAL . #lookaheads= LITERAL =

I've found quite a similar question that has no been answered, does any one have any clue how to solve this?


Solution

  • That's evidently a bug in jison, since the grammar is certainly LALR(1), and is handled without problems by bison. Apparently, jison is incorrectly computing the lookahead for the state in which the conflict occurs. (Update: It seems to be bug 205, reported in January 2014.)

    If you ask jison to produce an LR(1) parser instead of an LALR(1) grammar, then it correctly computes the lookaheads and the grammar passes without warnings. However, I don't think that is a sustainable solution.

    Here's another work-around. The Decl and Assign productions are not necessary; the "fix" was to remove LITERAL from Type and add a separate production for it.

    Program
      : EOF
      | Stmt EOF
      ;
    
    Decl
      : Type Ident ';'
      | LITERAL Ident ';'
      ;
    
    Assign
      : Ident '=' NUMBER ';'
      ;
    
    Stmt
      : Decl
      | Assign
      ;
    
    Type
      : INTEGER
      | STRING
      | BOOLEAN
      | VOID
      ;
    
    Ident
      : LITERAL
      ;
    

    You might want to consider recognizing more than one statement:

    Program
      : EOF
      | Stmts EOF
      ;
    
    Stmts
      : Stmt
      | Stmts Stmt
      ;