Page 224, Algorithm 4.31.
The method for constructing an LL(1) table says:
For each production
A -> alpha
of the grammar, do the following.
For each terminal
a
inFIRST(A)
, addA -> alpha
toM[A, a]
.If
eps
is inFIRST(alpha)
, then for each terminalb
inFOLLOW(A)
addA -> alpha
toM[A, b]
. Ifeps
is inFIRST(alpha)
and$
is inFOLLOW(A)
, addA -> alpha
toM[A, $]
as well.
They give the following grammar and the table.
(1) E -> T E'
(2) E' -> + T E'
(3) E' -> eps
(4) T -> F T'
(5) T' -> * F T'
(6) T' -> eps
(7) F -> ( E )
(8) F -> id
| id + * ( ) $
---+-----------------------------
E | 1 1
E' | 2 3 3
T | 4 4
T' | 6 5 6 6
F | 8 7
For reference, there are the FIRST
/FOLLOW
sets; taken from the book.
FIRST FOLLOW
E ( id ) $
E' + eps ) $
T ( id + ) $
T' * eps + ) $
F ( id + * ) $
When I run the algorithm (both by hand and using an implementation that I wrote based on the book), I get the following.
| id + * ( ) $
---+-----------------------------
E | 1 1
E' | 2|3 3 3
T | 4 4
T' | 5|6 5 6 6
F | 7|8 7|8
The difference is that I have some conflicts that the book avoids.
Here's my work, going through the algorithm. The first two steps are apparently the same as theirs, but we part ways on the third step. The other conflicts in my table appear for the same reason.
(1) E -> T E'
x
in FIRST(E) = { (, id }
, add (1)
to M[E, x]
. | id + * ( ) $
---+-----------------------------
E | 1 1
E' |
T |
T' |
F |
eps
is in FIRST(T E') = { (, id }
... it's not.(2) E' -> + T E'
x
in FIRST(E') = { +, eps }
, add (2)
to M[E', x]
. | id + * ( ) $
---+-----------------------------
E | 1 1
E' | 2
T |
T' |
F |
eps
is in FIRST(+ T E') = { + }
... it's not.(3) E' -> eps
x
in FIRST(E') = { +, eps }
, add (3)
to M[E', x]
. | id + * ( ) $
---+-----------------------------
E | 1 1
E' | 2|3
T |
T' |
F |
And here's the conflict.
The books says:
Since
FOLLOW(E') = { ), $ }
, production(3) E' -> eps
is added toM[E', )]
andM[E', $]
.
And we agree on this one: that's what I'd do next as well. But before that, the first step says:
- For each terminal
a
inFIRST(A)
, addA -> alpha
toM[A, a]
.
Applied to this rule, FIRST(E')
is { +, eps }
. So M[E', +]
should get (3) E' -> eps
.
You can see the same pattern in the next two rows. And in the last row, productions (7) F -> ( E )
and (8) F -> id
will both have the same FIRST(F) = { (, id }
, and thus M[F, id]
and M[F, (]
will both get (7)
and (8)
.
What's implied in the algorithm given in the book that I'm missing?
Also, here's a log of what my implementation gives. I have the same reasoning when I do the algorithm by hand, so please do not ask for code. I am looking for a mistake in my interpretation of the algorithm, not the implementation.
#########
## Log ##
#########
Looking at rule Expression -> Term Expression_.
Adding "Expression -> Term Expression_" to M[Expression, OpenParen] due to step 1.
Adding "Expression -> Term Expression_" to M[Expression, Identifier] due to step 1.
Looking at rule Expression_ -> Plus Term Expression_.
Adding "Expression_ -> Plus Term Expression_" to M[Expression_, Plus] due to step 1.
Looking at rule Expression_ -> ε.
Adding "Expression_ -> ε" to M[Expression_, Plus] due to step 1.
Adding "Expression_ -> ε" to M[Expression_, CloseParen] due to step 2a.
Adding "Expression_ -> ε" to M[Expression_, $] due to step 2b.
Looking at rule Term -> Factor Term_.
Adding "Term -> Factor Term_" to M[Term, OpenParen] due to step 1.
Adding "Term -> Factor Term_" to M[Term, Identifier] due to step 1.
Looking at rule Term_ -> Times Factor Term_.
Adding "Term_ -> Times Factor Term_" to M[Term_, Times] due to step 1.
Looking at rule Term_ -> ε.
Adding "Term_ -> ε" to M[Term_, Times] due to step 1.
Adding "Term_ -> ε" to M[Term_, Plus] due to step 2a.
Adding "Term_ -> ε" to M[Term_, CloseParen] due to step 2a.
Adding "Term_ -> ε" to M[Term_, $] due to step 2b.
Looking at rule Factor -> OpenParen Expression CloseParen.
Adding "Factor -> OpenParen Expression CloseParen" to M[Factor, OpenParen] due to step 1.
Adding "Factor -> OpenParen Expression CloseParen" to M[Factor, Identifier] due to step 1.
Looking at rule Factor -> Identifier.
Adding "Factor -> Identifier" to M[Factor, OpenParen] due to step 1.
Adding "Factor -> Identifier" to M[Factor, Identifier] due to step 1.
#################
## Final table ##
#################
Expression + OpenParen => { Expression -> Term Expression_ }
Expression + Identifier => { Expression -> Term Expression_ }
Expression_ + Plus => { Expression_ -> Plus Term Expression_ , Expression_ -> ε }
Expression_ + CloseParen => { Expression_ -> ε }
Expression_ + Symbol($) => { Expression_ -> ε }
Term + OpenParen => { Term -> Factor Term_ }
Term + Identifier => { Term -> Factor Term_ }
Term_ + Times => { Term_ -> Times Factor Term_ , Term_ -> ε }
Term_ + Plus => { Term_ -> ε }
Term_ + CloseParen => { Term_ -> ε }
Term_ + Symbol($) => { Term_ -> ε }
Factor + OpenParen => { Factor -> OpenParen Expression CloseParen , Factor -> Identifier }
Factor + Identifier => { Factor -> OpenParen Expression CloseParen , Factor -> Identifier }
This looks like an error in the book.
For each terminal a in FIRST(A), add A -> alpha to M[A, a].
There should be FIRST(alpha), not FIRST(A). Indeed, if you see F -> (E)
, you need to add this rule to M[F, '('], and if you see F -> id
, add this to M[F,id]. It makes no sense to add both rules to both states.
Here's the fixed algorithm from the book (emphasis on the changed part):
For each production
A -> alpha
of the grammar, do the following.
- For each terminal
a
inFIRST(alpha)
, addA -> alpha
toM[A, a]
.- If eps is in
FIRST(alpha)
, then for each terminalb
inFOLLOW(A)
addA -> alpha
toM[A, b]
. Ifeps
is inFIRST(alpha)
and$
is inFOLLOW(A)
, addA -> alpha
toM[A, $]
as well.
Here's the correct order for creating the table:
Looking at rule Expression -> Term Expression_.
Adding "Expression -> Term Expression_" to M[Expression, OpenParen] due to step 1.
Adding "Expression -> Term Expression_" to M[Expression, Identifier] due to step 1.
Looking at rule Expression_ -> Plus Term Expression_.
Adding "Expression_ -> Plus Term Expression_" to M[Expression_, Plus] due to step 1.
Looking at rule Expression_ -> ε.
Adding "Expression_ -> ε" to M[Expression_, CloseParen] due to step 2a.
Adding "Expression_ -> ε" to M[Expression_, $] due to step 2b.
Looking at rule Term -> Factor Term_.
Adding "Term -> Factor Term_" to M[Term, OpenParen] due to step 1.
Adding "Term -> Factor Term_" to M[Term, Identifier] due to step 1.
Looking at rule Term_ -> Times Factor Term_.
Adding "Term_ -> Times Factor Term_" to M[Term_, Times] due to step 1.
Looking at rule Term_ -> ε.
Adding "Term_ -> ε" to M[Term_, Plus] due to step 2a.
Adding "Term_ -> ε" to M[Term_, CloseParen] due to step 2a.
Adding "Term_ -> ε" to M[Term_, $] due to step 2b.
Looking at rule Factor -> OpenParen Expression CloseParen.
Adding "Factor -> OpenParen Expression CloseParen" to M[Factor, OpenParen] due to step 1.
Looking at rule Factor -> Identifier.
Adding "Factor -> Identifier" to M[Factor, Identifier] due to step 1.