compiler-constructionbisonyacclexcontext-free-grammar

Issue with Bison Parser for Handling for Loops (Syntax Error)


I am writing a parser using a context-free grammar with Bison and Flex to handle a simple language that includes for loops. However, I'm encountering a syntax error when I try to parse valid for loop statements. Below are my .l and .y files, along with the output and error message I'm getting.

Flex File (for.l):

%option noyywrap

%{
#include "for2.tab.h"  // Include the header generated by Bison
%}

alpha [A-Za-z]
digit [0-9]

%%
[ \t\n]               ; // Ignore whitespace
for                   return FOR;
printf                return ID;
{digit}+              { yylval = atoi(yytext); return NUM; }
{alpha}({alpha}|{digit})* { return ID; }
\"([^\\\"]|\\.)*\"    { return STRING; }
"++"                  return INC;
"--"                  return DEC;
"<="                  return LE;
">="                  return GE;
"=="                  return EQ;
"!="                  return NE;
"||"                  return OR;
"&&"                  return AND;
[-+*/()=<>!;{},%]     return yytext[0];
.                     { fprintf(stderr, "Unrecognized character: %s\n", yytext); }

%%

Bison File (for.y):

%{
#include <stdio.h>
#include <stdlib.h>

int yylex(void);
void yyerror(const char *s) {
    fprintf(stderr, "Error: %s\n", s);
}
%}

%token ID NUM FOR STRING
%token LE GE EQ NE OR AND
%token INC DEC 

%left OR
%left AND
%left EQ NE
%left '<' '>' LE GE

%%

Program     : Statement                  { printf("Input accepted\n"); exit(0); }
            ;

Statement   : ForStatement
            | Expression ';'
            | FunctionCall ';'
            ;

ForStatement: FOR '(' AssignmentExpression ';' LogicalExpression ';' IncrementExpression ')' Block
            ;

Block       : '{' StatementList '}'
            | Statement
            | /* empty */
            ;

StatementList : StatementList Statement
              | Statement
              ;

Expression  : AssignmentExpression
            | ArithmeticExpression
            ;

AssignmentExpression
            : ID '=' ArithmeticExpression
            ;

ArithmeticExpression
            : ArithmeticExpression '+' Term
            | ArithmeticExpression '-' Term
            | Term
            ;

Term        : Term '*' Factor
            | Term '/' Factor
            | Factor
            ;

Factor      : '(' ArithmeticExpression ')'
            | ID
            | NUM
            | '-' Factor
            ;

IncrementExpression
            : ID INC
            | ID DEC
            | AssignmentExpression
            ;

LogicalExpression
            : LogicalExpression OR LogicalTerm
            | LogicalTerm
            ;

LogicalTerm : LogicalTerm AND LogicalFactor
            | LogicalFactor
            ;

LogicalFactor
            : RelationalExpression
            | '!' LogicalFactor
            | '(' LogicalExpression ')'
            ;

RelationalExpression
            : ArithmeticExpression '<' ArithmeticExpression
            | ArithmeticExpression '>' ArithmeticExpression
            | ArithmeticExpression LE ArithmeticExpression
            | ArithmeticExpression GE ArithmeticExpression
            | ArithmeticExpression EQ ArithmeticExpression
            | ArithmeticExpression NE ArithmeticExpression
            ;

FunctionCall: ID '(' ArgumentList ')'
            ;

ArgumentList: Expression
            | ArgumentList ',' Expression
            | STRING
            | /* empty */
            ;

%%

int main() {
    printf("Enter the expression:\n");
    yyparse();
    return 0;
}

Output and Error:

When I try to parse the following input:

for ( i = 0 ; i < 5 ; i++ ) { a = a + i; }

I get the following output:

Enter the expression:
for ( i = 0 ; i < 5 ; i ++ ) { a = a + i; }
(null) = 0
Error: syntax error

On running the following command:

bison -dv for2.y

i get:

for2.y: conflicts: 5 shift/reduce

My Question: I don't understand why the parser is throwing a syntax error. The for loop input looks valid, and I've already defined the necessary tokens and rules to handle for loops and increment expressions like i++.

I've tried various changes, but the error persists. Could someone point out what might be wrong with my grammar or Flex/Bison setup? Any help would be appreciated!


Solution

  • Because you have shift-reduce conflicts, the grammar cannot be recognized by an LALR(1) parser. As a result, bison will resolve the conflicts by choosing to shift (rather than reduce) in each conflicting state, resulting in a parser that recognizes a SUBSET of the language specified by the grammar.

    In order to see exactly what is happening, you need to examine the states with conflicts to see how the resolution of the conflicts affect how the parser recognizes things.

    With the code you've posted, when you look in for2.output, you'll see those 5 shift/reduce conflicts in State 77:

    State 77
    
        5 ForStatement: FOR '(' AssignmentExpression ';' LogicalExpression ';' IncrementExpression ')' . Block
    
        ID   shift, and go to state 1
        NUM  shift, and go to state 2
        FOR  shift, and go to state 3
        '('  shift, and go to state 4
        '{'  shift, and go to state 78
        '-'  shift, and go to state 5
    
        ID        [reduce using rule 8 (Block)]
        NUM       [reduce using rule 8 (Block)]
        FOR       [reduce using rule 8 (Block)]
        '('       [reduce using rule 8 (Block)]
        '-'       [reduce using rule 8 (Block)]
        $default  reduce using rule 8 (Block)
    

    which is noting the ambiguity that comes from have the rule Block: /* empty */ which does seem like an error in your grammar, but which should not cause the result you report. The resolution (which is mostly equivalent to simply deleting that rule) should work fine for this one case, though you might prefer changing it to Block: ';'.

    When I run your program, it works just fine:

    > ./for2
    Enter the expression:
    for ( i = 0 ; i < 5 ; i ++ ) { a = a + i; }
    Input accepted
    >
    

    which implies that what you posted is not exactly the same as what you are running.


    I can reproduce your result if I replace the Program: Statement rule with Program: StatementList (allowing one or more statements as the input rather than exiting after a single statement).

    In that case, it is the second line of input ((null) = 0;) causing the syntax error, as your AssignmentExpression rule only allows a single ID as the lvalue, not an lvalue expression. If you change that rule to

    AssignmentExpression
            : ArithmeticExpression '=' ArithmeticExpression
    

    then it works fine (with the added ; in the input)