parsingcompiler-constructiongrammarll-grammarlanguage-theory

Why can't a left-recursive, non-deterministic, or ambiguous grammar be LL(1)?


I've learned from several sources that an LL(1) grammar is:

  1. unambiguous,
  2. not left-recursive,
  3. and, deterministic (left-factorized).

What I can't fully understand is why the above is true for any LL(1) grammar. I know the LL(1) parsing table will have multiple entries at some cells, but what I really want to get is a formal and general (not with an example) proof of the following proposition(s):

A left-recursive (1), non-deterministic (2), or ambiguous (3) grammar is not LL(1).


Solution

  • I have done some more research, and I think I've found a solution for the 1st and 2nd questions, as for the 3rd one, I found an existing solution here on SO for it, the proof attempts are written below:

    We start by writing the three rules of the definition of an LL(1) grammar:

    For every production A -> α | β with α ≠ β:

    1. FIRST(α) ∩ FIRST(β) = Ø.
    2. If β =>* ε then FIRST(α) ∩ FOLLOW(A) = Ø (also, if α =>* ε then FIRST(β) ∩ FOLLOW(A) = Ø).
    3. Including ε in rule (1) implies that at most one of α and β can derive ε.

    Proposition 1: A non-factored grammar is not LL(1).

    Proof:

    If a grammar G is non-factored then there exists a production in G of the form:

    A -> ωα1 | ωα2 | ... | ωαn
    

    (where αi is the i-th α, not the symbols α and i), with α1 ≠ α2 ≠ ... ≠ αn. We can then easily show that:

    ∩(i=1,..,n) FIRST(ωαi) ≠ Ø
    

    which contradicts rule (1) of the definition, thus, a non-factored grammar is not LL(1). ∎

    Proposition 2: A left-recursive grammar is not LL(1).

    Proof:

    If a grammar is left-recursive then there exists a production in G of the form:

    S -> Sα | β
    

    Three cases arise here:

    1. If FIRST(β) ≠ {ε} then:

          FIRST(β) ⊆ FIRST(S)

      =>  FIRST(β) ∩ FIRST(Sα) ≠ Ø

      which contradicts rule (1) of the definition.

    2. If FIRST(β) = {ε} then:

      2.1. If ε ∈ FIRST(α) then:

      ε ∈ FIRST(Sα)

      which contradicts rule (3) of the definition.

      2.2. If ε ∉ FIRST(α) then:

          FIRST(α) ⊆ FIRST(S) (because β =>* ε)

      =>  FIRST(α) ⊆ FIRST(Sα) ........ (I)

      we also know that:

      FIRST(α) ⊆ FOLLOW(S) ........ (II)

      by (I) and (II), we have:

      FIRST(Sα) ∩ FOLLOW(S) ≠ Ø

      and since β =>* ε, this contradicts rule (2) of the definition.

    In every case we arrive at a contradiction, hence, a left-recursive grammar is not LL(1). ∎

    Proposition 3: An ambiguous grammar is not LL(1).

    Proof:

    While the above proofs are mine, this one is not, it's by Kevin A. Naudé which I got from his answer that is linked below:

    https://stackoverflow.com/a/18969767/6275103