I'm currently using happy
to parse a language, but I don't think the parser is relevant, except to say it's an LALR parser. Here's a small excerpt from the grammar:
ArithExpr -> ArithExpr + ArithExpr
ArithExpr -> ( ArithExpr )
ArithExpr -> ...
BoolExpr -> ArithExpr == ArithExpr
BoolExpr -> ( BoolExpr )
BoolExpr -> ...
The problem is that I'm getting reduce-reduce conflicts. I think the issue arises when I try to parse something like the following:
( ( 2 + 3 ) == ( 4 + 5 ) )
There's only one way to parse this expression, but the problem is I think even at the first parenthesis the parser starts having some trouble. The reason I think this is the case is that the parser does not know whether it's facing a ArithExpr
or a BoolExpr
in the future, and as it's only got one token of lookahead it has to make an arbitrary choice which may be wrong.
Is there anyway to rewrite the grammar to accept this language? Or should I really just be parsing both ArithExpr
and BoolExpr
just as one uniform Expr
and dealing with the actual types during type checking?
You should just parse Expr
and do the type checking during semantic analysis. Otherwise, you will have really a hard time dealing with either parenthesized expressions (you can't tell what type they are until too late) or first-class boolean values (a variable might have a boolean value, no?).
See my answer here for an alternative (but it ends up giving the same advice); I provide the link for completeness only because I'm really not convinced of the value of the techniques described in that answer, but I think it is essentially the same question with a different LALR parser generator.