I have the following grammar:
S → a S b S | b S a S | ε
Since I'm trying to write a small compiler for it, I'd like to make it LL(1). I see that there seems to be a FIRST/FOLLOW conflict here, and I know I have to use substitution to resolve it, but I'm not exactly sure how to go about it. Here is my proposed grammar, but I'm not sure if it's correct:
S-> aSbT | epsilon
T-> bFaF| epsilon
F-> epsilon
Can someone help out?
In his original paper on LR parsing, Knuth gives the following grammar for this language, which he conjectures "is the briefest possible unambiguous grammar for this language:"
S → ε | aAbS | bBaS
A → ε | aAbA
B → ε | bBaB
Intuitively, this tries to break up any string of As and Bs into blocks that balance out completely. Some blocks start with a and end with b, while others start with b and end with a.
We can compute FIRST and FOLLOW sets as follows:
FIRST(S) = { ε, a, b }
FIRST(A) = { ε, a }
FIRST(B) = { ε, b }
FOLLOW(S) = { $ }
FOLLOW(A) = { b }
FOLLOW(B) = { a }
Based on this, we get the following LL(1) parse table:
| a | b | $
--+-------+-------+-------
S | aAbS | bBaS | e
A | aAbA | e |
B | e | bBaB |
And so this grammar is not only LR(1), but it's LL(1) as well.
Hope this helps!