parsingcompiler-constructionstackgrammarlr-grammar

LR(1) - How do I know how many items to pop of the node stack when there are epsilon productions?


Suppose I have this simple grammar (labeled):

1 ||  S' -> A;
2 ||  A -> a B C D z;
3 ||  B -> b E;
4 ||  E -> e | ;
5 ||  C -> c | ;
6 ||  D -> d | ;

I can construct the LR(1) parsing table, but I don't fully understand how to parse it. You can take a look at the table here. Suppose the input is:

a b z

If you look at production #2, you'll match a, then in production #3, b, but then, at the end of production #3, there is a nullable production. The same applies to the end of production #2 (except for the z). When building the node stack, I have to pop off the number of symbols in the rhs of the production I am reducing, but how will I know how much when some of the symbols are nullable?


Solution

  • A symbol is a symbol and an empty right-hand side is empty. It's as simple as that.

    If you are reducing an empty right-hand side, there are zero symbols and so you pop zero things off the stack.

    If you are reducing a right-hand side with three symbols, there are three symbols and you need to pop three symbols off the stack (and associated states). It's irrelevant how many tokens each symbol on the right-hand side was associated with, because it has been reduced to one symbol, and that symbol was pushed onto the stack by the GOTO action.