grammarbisonshift-reduce-conflict

S/R conflict in Bison


I have a problem with a [relatively simple] grammar, a distilled version of a much bigger grammar. Bison reports one shift/reduce conflict, which is resolved incorrectly: the resulting parser fails to parse valid language. I am not kidding: I intermittently spent more than a year trying to eliminate the conflict but failed. ChatGPT is uncooperative :) Please advise.

%token YY_WORD_NAME
%left YY_IN 

%%

word_exp:
  word_exp YY_IN set_exp 
| YY_WORD_NAME 

set_exp:
  '{' word_exp '}' 
| '{' YY_WORD_NAME YY_IN set_exp ':' word_exp '}'

%%

Solution

  • The basic problem you have here is lookahead -- in order to decide that a YY_WORD_NAME is not a word_exp, you need to look ahead through the following YY_IN set_expr (and unbounded number of tokens) to see the :.

    The easiest fix for this is probably to "relax" the grammar a bit (accept some constructs that are not legal), and then have a post-pass that flags the erroneous conditions. If you use:

    word_exp:
      word_exp YY_IN set_exp 
    | YY_WORD_NAME 
    
    set_exp:
      '{' word_exp '}' 
    | '{' word_exp ':' word_exp '}'
    

    That fixes the conflict, but accepts things that don't have exactly one YY_IN before the : like:

    { A in { B } in { C } : D }
    { A : B }
    

    (here I'm assuming YY_IN is an in keyword and YY_WORD_NAME is just a non-keyword identifier)

    You can even flag this as an error in the bison action immediately after the incorrect construct (rather than by building a data structure you later traverse, though you may be doing that anyway for other reasons)

    %token YY_WORD_NAME
    %left YY_IN 
    
    %union {
        struct {
            int     incnt;
            // any other info needed from word_exp
        } word_exp;
    }
    
    %type<word_exp> word_exp
    
    %%
    
    word_exp:
      word_exp YY_IN set_exp { $$.incnt = $1.incnt + 1; }
    | YY_WORD_NAME           { $$.incnt = 0; }
    
    set_exp:
      '{' word_exp '}' 
    | '{' word_exp ':' word_exp '}'  { if ($2.incnt != 1) { 
                                           yyerror("Syntax error");
                                           YYERROR; /* or YYABORT */
                                       }
                                     }
    
    %%