
Difference between LL and LR Parsing


I'm currently studying context-free grammars and methods for parsing them. From my understanding, context-free grammars can be parsed top-down (LL) or bottom-up (LR). Is it correct that LL parsers require a grammar's production rules to be strictly unambiguous before it can be parsed? And that LR parsers also require grammars to be unambiguous, but instead of rewriting the ambiguous production rules, additional precedence rules can be added to resolve the ambiguity? And how does lookahead fit into all this?


Solution

  • From my understanding, context free grammars can be parsed via top down/LL or bottom up/LR.

    Yes, LL parsing works top-down. LR parsing is usually considered a bottom-up parsing approach, though some authors consider it to be a hybrid of top-down and bottom-up because it uses context about what can appear where in a generated parse tree.

    Is it correct that LL parsers require a grammar's production rules to be strictly unambiguous before it can be parsed?

    LL parsers only work for unambiguous grammars. The most common classes of LL parsers (LL(1), LL(*)) do not work on all unambiguous grammars either; they impose extra restrictions beyond unambiguity. For example, LL(1) parsers cannot handle left recursion, because a top-down parser expanding a rule such as E → E - n would keep expanding E without ever consuming a token.
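To make the left-recursion restriction concrete, here is a minimal sketch in Python. The grammar `E -> E '-' NUM | NUM`, the function names, and the token format (a list of strings) are assumptions chosen for illustration and are not taken from any particular parser generator. The first function transcribes the left-recursive rule directly and never terminates; the second uses the usual rewrite `E -> NUM ('-' NUM)*`, which a top-down parser can handle.

```python
def parse_left_recursive(tokens, pos=0):
    # Naive transcription of E -> E '-' NUM: to parse E we first parse E at
    # the same position, so no token is ever consumed before recursing.
    left, pos = parse_left_recursive(tokens, pos)
    if tokens[pos] != "-":
        raise SyntaxError("expected '-'")
    return left - int(tokens[pos + 1]), pos + 2


def parse_expr(tokens):
    """Parse and evaluate E -> NUM ('-' NUM)*, the left-recursion-free form."""
    pos = 0

    def expect_number():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError(f"expected a number at position {pos}")
        value = int(tokens[pos])
        pos += 1
        return value

    result = expect_number()
    # One token of lookahead decides whether another '-' NUM follows.
    while pos < len(tokens) and tokens[pos] == "-":
        pos += 1
        result -= expect_number()  # folding leftwards keeps '-' left-associative
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result
```

Calling `parse_expr("10 - 3 - 2".split())` evaluates the input as `(10 - 3) - 2`, while `parse_left_recursive` on the same tokens runs until Python raises `RecursionError`.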

    And that LR parsers, on the other hand, also require grammars to be unambiguous, but instead of having to rewrite any ambiguous production rules, additional precedence rules can be added to the production rules to resolve their ambiguity?

    Yes and no. It is true that, like LL parsers, the most common types of LR parsers (LR(0), SLR(1), LALR(1), LR(1), IELR(1)) require the grammar to be unambiguous. You are correct that many ambiguities can be resolved with precedence declarations that break ties in an otherwise ambiguous grammar, but this can't resolve all ambiguities. Additionally, there are some unambiguous grammars that cannot be parsed by any LR(k) parser.
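As a rough illustration of how precedence can break ties, the sketch below evaluates expressions for the ambiguous grammar `E -> E op E | NUM` with a hand-written shift/reduce loop. This is a simplified operator-precedence style evaluator, not a table-driven LR parser, and the `PRECEDENCE` table, the `evaluate` name, and the assumption that the input is well formed (numbers and operators strictly alternating) are all choices made for this example. The point is the single decision in the middle: when the stack holds `E op E` and the next token is another operator, the precedence table decides whether to reduce first or to shift.

```python
import operator

# Hypothetical precedence table; every operator is treated as left-associative.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}
APPLY = {"+": operator.add, "-": operator.sub,
         "*": operator.mul, "/": operator.truediv}


def evaluate(tokens):
    """Evaluate a well-formed token list such as ['2', '+', '3', '*', '4']."""
    stack = []  # alternates values and operators: [value, op, value, ...]

    def reduce_top():
        # Reduce E op E -> E by replacing the top three entries with one value.
        right = stack.pop()
        op = stack.pop()
        left = stack.pop()
        stack.append(APPLY[op](left, right))

    for token in tokens:
        if token in PRECEDENCE:
            # Shift/reduce decision: reduce while the operator already on the
            # stack binds at least as tightly as the incoming one.
            while len(stack) >= 3 and PRECEDENCE[stack[-2]] >= PRECEDENCE[token]:
                reduce_top()
            stack.append(token)  # shift the operator
        else:
            stack.append(int(token))  # shift a number (NUM -> E)
    while len(stack) >= 3:
        reduce_top()
    return stack[0]
```

With this table, `2 + 3 * 4` shifts the `*` because it binds tighter than `+`, while `10 - 3 - 2` reduces `10 - 3` first because equal precedence is treated as left-associative.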

    But how does look ahead fit into all this?

    Adding lookahead to an LL or LR parser gives the parser more context with which to decide which production rule to apply (for LL parsers) or whether to shift or reduce (for LR parsers). Intuitively, being able to see further into the token sequence allows the parser to rule out options that cannot match what comes next. The specific rules for how this lookahead works depend on the parsing algorithm; for example, LR(2) parsers have some nuances that don't show up in LR(1) parsers. You'll likely find the information you're looking for by reading up specifically on LL(1) parsing, LR(0) parsing, and LR(1) parsing, and you can use that as a launching point.
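The effect of extra lookahead on an LL-style decision can be sketched with a toy grammar. Everything here is hypothetical: the three alternatives for `S`, the `viable_alternatives` helper, and the simplifying assumptions that each alternative is a plain sequence of terminals and that `S` spans the whole input. Real LL(k) parsers precompute lookahead sets from the grammar instead of comparing against rule bodies like this.

```python
# Hypothetical toy grammar: every alternative for S is a sequence of terminals.
ALTERNATIVES = {
    "S -> a b": ["a", "b"],
    "S -> a c": ["a", "c"],
    "S -> d": ["d"],
}


def viable_alternatives(tokens, k):
    """Return the alternatives that agree with the first k input tokens."""
    lookahead = tokens[:k]
    return [
        name
        for name, body in ALTERNATIVES.items()
        # Compare the lookahead with the same-length prefix of the rule body.
        if body[:len(lookahead)] == lookahead
    ]
```

For the input `['a', 'c']`, one token of lookahead leaves both `S -> a b` and `S -> a c` as candidates, so an LL(1) parser cannot choose between them; with two tokens only `S -> a c` remains. Left-factoring the grammar into `S -> a T`, `T -> b | c` would make the same language parseable with a single token of lookahead.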