This is the skeleton of my bottom-up parser:
while (!stack.empty())
{
if (!reduce())
{
shift();
}
}
And I have these rules:
Program -> Expr
Expr -> Expr '+' Expr
Expr -> Number
Number -> FLOAT | INTEGER // These 2 are terminal symbols
If I have the following input:
2 + 3
2 gets pushed onto the stack, then gets reduced to a Number, then an Expression and then a Program. So it doesn't have any chance to parse the whole addition. How can I force the parser to parse the rest too? Should I do something like:
Program -> Expr EOF
?
Bottom-up parsing is pretty new for me so any help is appreciated.
You can use a look-ahead to decide whether to shift or reduce. Your example grammar fits in the LR(1) family of grammars, so a bottomup parser with a 1 symbol look-ahead should be able to capture it.
In your example you have input:
2 + 3
So you build up a stack:
Program, Expr, Number
Shift FLOAT
, reduce Number
, reduce Expr
. Now you have a choice, whether to reduce Program
or shift '+'
, so you look ahead is there is a '+'
. If so you shift and follow the Expr = Expr '+' Expr
rule.
You may still want to do Program = Expr EOF
so your lookahead can always return EOF
if there's nothing left to parse.