pythonparsinglalr

What is the shift/reduce conflict in this SLY program?


I'm porting a code from pyparsing to SLY, it's a lambda calc grammar. I have this rule

from sly import Lexer, Parser


class Lex(Lexer):
    tokens = {ID, LAMB, DOT, LPAR, RPAR}
    ID = r"[a-z]"
    LAMB = r"λ"
    DOT = r"\."
    LPAR = r"\("
    RPAR = r"\)"

    ignore = " \t"


class Par(Parser):
    """
    term : abst
         | lamb
         | ID
         | "(" term ")"
    abst | term ID
    lamb | "λ" ID "." term
    """
    tokens = Lex.tokens
    debugfile = 'lamb.out'


    @_("abst", "lamb", "ID")
    def term(self, p):
        return p[0]

    @_("LPAR term RPAR")
    def term(self, p):
        return p.term

    @_("term ID")
    def abst(self, p):
        return (p.term, p.ID)

    @_("LAMB ID DOT term")
    def lamb(self, p):
        return ("λ", p.ID, p.term)


def parse(input_: str):
    return Par().parse(Lex().tokenize(input_))


from pprint import pprint

pprint(parse("x y z"))
pprint(parse("λx.x"))
pprint(parse("(λx.λy.x y) a b"))

I know this grammar is wrong because it enforce an ID as last element of an application and this is not required for lambda calculus. But my problem is to understand the shift reduce conflict that is happening on last state, here is the debug output.

Grammar:

Rule 0     S' -> term
Rule 1     term -> LPAR term RPAR
Rule 2     term -> ID
Rule 3     term -> lamb
Rule 4     term -> abst
Rule 5     abst -> term ID
Rule 6     lamb -> LAMB ID DOT term

Terminals, with rules where they appear:

DOT                  : 6
ID                   : 2 5 6
LAMB                 : 6
LPAR                 : 1
RPAR                 : 1
error                : 

Nonterminals, with rules where they appear:

abst                 : 4
lamb                 : 3
term                 : 1 5 6 0


state 0

    (0) S' -> . term
    (1) term -> . LPAR term RPAR
    (2) term -> . ID
    (3) term -> . lamb
    (4) term -> . abst
    (6) lamb -> . LAMB ID DOT term
    (5) abst -> . term ID
    LPAR            shift and go to state 2
    ID              shift and go to state 3
    LAMB            shift and go to state 6

    term                           shift and go to state 1
    lamb                           shift and go to state 4
    abst                           shift and go to state 5

state 1

    (0) S' -> term .
    (5) abst -> term . ID
    ID              shift and go to state 7


state 2

    (1) term -> LPAR . term RPAR
    (1) term -> . LPAR term RPAR
    (2) term -> . ID
    (3) term -> . lamb
    (4) term -> . abst
    (6) lamb -> . LAMB ID DOT term
    (5) abst -> . term ID
    LPAR            shift and go to state 2
    ID              shift and go to state 3
    LAMB            shift and go to state 6

    term                           shift and go to state 8
    lamb                           shift and go to state 4
    abst                           shift and go to state 5

state 3

    (2) term -> ID .
    ID              reduce using rule 2 (term -> ID .)
    $end            reduce using rule 2 (term -> ID .)
    RPAR            reduce using rule 2 (term -> ID .)


state 4

    (3) term -> lamb .
    ID              reduce using rule 3 (term -> lamb .)
    $end            reduce using rule 3 (term -> lamb .)
    RPAR            reduce using rule 3 (term -> lamb .)


state 5

    (4) term -> abst .
    ID              reduce using rule 4 (term -> abst .)
    $end            reduce using rule 4 (term -> abst .)
    RPAR            reduce using rule 4 (term -> abst .)


state 6

    (6) lamb -> LAMB . ID DOT term
    ID              shift and go to state 9


state 7

    (5) abst -> term ID .
    ID              reduce using rule 5 (abst -> term ID .)
    $end            reduce using rule 5 (abst -> term ID .)
    RPAR            reduce using rule 5 (abst -> term ID .)


state 8

    (1) term -> LPAR term . RPAR
    (5) abst -> term . ID
    RPAR            shift and go to state 10
    ID              shift and go to state 7


state 9

    (6) lamb -> LAMB ID . DOT term
    DOT             shift and go to state 11


state 10

    (1) term -> LPAR term RPAR .
    ID              reduce using rule 1 (term -> LPAR term RPAR .)
    $end            reduce using rule 1 (term -> LPAR term RPAR .)
    RPAR            reduce using rule 1 (term -> LPAR term RPAR .)


state 11

    (6) lamb -> LAMB ID DOT . term
    (1) term -> . LPAR term RPAR
    (2) term -> . ID
    (3) term -> . lamb
    (4) term -> . abst
    (6) lamb -> . LAMB ID DOT term
    (5) abst -> . term ID
    LPAR            shift and go to state 2
    ID              shift and go to state 3
    LAMB            shift and go to state 6

    term                           shift and go to state 12
    lamb                           shift and go to state 4
    abst                           shift and go to state 5

state 12

    (6) lamb -> LAMB ID DOT term .
    (5) abst -> term . ID
  ! shift/reduce conflict for ID resolved as shift
    $end            reduce using rule 6 (lamb -> LAMB ID DOT term .)
    RPAR            reduce using rule 6 (lamb -> LAMB ID DOT term .)
    ID              shift and go to state 7


Conflicts:

shift/reduce conflict for ID in state 12 resolved as shift%   

I setup appl : term ID to encode left associativity of application the right rule would be appl : term term but this created an right association. My questions are:

  1. Why the shift/reduce is happening and how to solve it. To me since ID is an terminal it can only be shifted, there is no reduce for ID. What I'm missing?
  2. How to encode left associativity when no infix operator is envolved

Thanks @rici, I updated the grammar, I couldn't parse λx.λy.x so I added p term as the last rule with parenthesis. Here is my grammar now. It works as expected, but, I got 15 warnings, lol

WARNING: 4 shift/reduce conflicts
WARNING: 11 reduce/reduce conflicts
Parser debugging for Par written to lamb.out
(('x', 'y'), 'z')
('λ', 'x', 'x')
((('λ', 'x', ('λ', 'y', 'x')), 'a'), 'b')

Since it worked like I expect I think I ordered the rules right. I'm still struggling to get an intuition on writing correct grammars. The core is here

from sly import Lexer, Parser


class Lex(Lexer):
    tokens = {ID, LAMB, DOT, LPAR, RPAR}
    ID = r"[a-z]"
    LAMB = r"λ"
    DOT = r"\."
    LPAR = r"\("
    RPAR = r"\)"

    ignore = " \t"


class Par(Parser):
    """
    lamb : LAMB ID DOT term
         | appl
    appl : appl term
         | term
         | pterm
    pterm: LPAR term RPAR
    term : lamb
         | ID
    """
    tokens = Lex.tokens
    debugfile = 'lamb.out'

    @_("LAMB ID DOT term")
    def lamb(self, p):
        return ("λ", p.ID, p.term)

    @_("appl")
    def lamb(self, p):
        return p.appl

    @_("appl term")
    def appl(self, p):
        return (p.appl, p.term)

    @_("term")
    def appl(self, p):
        return p.term

    @_("pterm")
    def appl(self, p):
        return p.pterm

    @_("LPAR term RPAR")
    def pterm(self, p):
        return p.term

    @_("ID")
    def term(self, p):
        return p.ID

    @_("lamb")
    def term(self, p):
        return p.lamb


def parse(input_: str):
    return Par().parse(Lex().tokenize(input_))


from pprint import pprint

pprint(parse("x y z"))
pprint(parse("λx.x"))
pprint(parse("(λx.λy.x) a b"))

Here is the debug

Grammar:

Rule 0     S' -> lamb
Rule 1     lamb -> appl
Rule 2     lamb -> LAMB ID DOT term
Rule 3     appl -> pterm
Rule 4     appl -> term
Rule 5     appl -> appl term
Rule 6     pterm -> LPAR term RPAR
Rule 7     term -> lamb
Rule 8     term -> ID

Terminals, with rules where they appear:

DOT                  : 2
ID                   : 2 8
LAMB                 : 2
LPAR                 : 6
RPAR                 : 6
error                : 

Nonterminals, with rules where they appear:

appl                 : 1 5
lamb                 : 7 0
pterm                : 3
term                 : 2 4 5 6


state 0

    (0) S' -> . lamb
    (1) lamb -> . appl
    (2) lamb -> . LAMB ID DOT term
    (3) appl -> . pterm
    (4) appl -> . term
    (5) appl -> . appl term
    (6) pterm -> . LPAR term RPAR
    (7) term -> . lamb
    (8) term -> . ID
    LAMB            shift and go to state 3
    LPAR            shift and go to state 7
    ID              shift and go to state 4

    lamb                           shift and go to state 1
    appl                           shift and go to state 2
    term                           shift and go to state 5
    pterm                          shift and go to state 6

state 1

    (0) S' -> lamb .
    (7) term -> lamb .
  ! reduce/reduce conflict for $end resolved using rule 0 (S' -> lamb .)
    ID              reduce using rule 7 (term -> lamb .)
    LAMB            reduce using rule 7 (term -> lamb .)
    LPAR            reduce using rule 7 (term -> lamb .)


state 2

    (1) lamb -> appl .
    (5) appl -> appl . term
    (7) term -> . lamb
    (8) term -> . ID
    (1) lamb -> . appl
    (2) lamb -> . LAMB ID DOT term
    (3) appl -> . pterm
    (4) appl -> . term
    (5) appl -> . appl term
    (6) pterm -> . LPAR term RPAR
  ! shift/reduce conflict for ID resolved as shift
  ! shift/reduce conflict for LAMB resolved as shift
  ! shift/reduce conflict for LPAR resolved as shift
    $end            reduce using rule 1 (lamb -> appl .)
    RPAR            reduce using rule 1 (lamb -> appl .)
    ID              shift and go to state 4
    LAMB            shift and go to state 3
    LPAR            shift and go to state 7

    appl                           shift and go to state 2
    term                           shift and go to state 8
    lamb                           shift and go to state 9
    pterm                          shift and go to state 6

state 3

    (2) lamb -> LAMB . ID DOT term
    ID              shift and go to state 10


state 4

    (8) term -> ID .
    $end            reduce using rule 8 (term -> ID .)
    ID              reduce using rule 8 (term -> ID .)
    LAMB            reduce using rule 8 (term -> ID .)
    LPAR            reduce using rule 8 (term -> ID .)
    RPAR            reduce using rule 8 (term -> ID .)


state 5

    (4) appl -> term .
    $end            reduce using rule 4 (appl -> term .)
    ID              reduce using rule 4 (appl -> term .)
    LAMB            reduce using rule 4 (appl -> term .)
    LPAR            reduce using rule 4 (appl -> term .)


state 6

    (3) appl -> pterm .
    $end            reduce using rule 3 (appl -> pterm .)
    ID              reduce using rule 3 (appl -> pterm .)
    LAMB            reduce using rule 3 (appl -> pterm .)
    LPAR            reduce using rule 3 (appl -> pterm .)
    RPAR            reduce using rule 3 (appl -> pterm .)


state 7

    (6) pterm -> LPAR . term RPAR
    (7) term -> . lamb
    (8) term -> . ID
    (1) lamb -> . appl
    (2) lamb -> . LAMB ID DOT term
    (3) appl -> . pterm
    (4) appl -> . term
    (5) appl -> . appl term
    (6) pterm -> . LPAR term RPAR
    ID              shift and go to state 4
    LAMB            shift and go to state 3
    LPAR            shift and go to state 7

    term                           shift and go to state 11
    lamb                           shift and go to state 9
    appl                           shift and go to state 2
    pterm                          shift and go to state 6

state 8

    (5) appl -> appl term .
    (4) appl -> term .
  ! reduce/reduce conflict for $end resolved using rule 5 (appl -> appl term .)
  ! reduce/reduce conflict for ID resolved using rule 5 (appl -> appl term .)
  ! reduce/reduce conflict for LAMB resolved using rule 5 (appl -> appl term .)
  ! reduce/reduce conflict for LPAR resolved using rule 5 (appl -> appl term .)
  ! reduce/reduce conflict for RPAR resolved using rule 5 (appl -> appl term .)
    $end            reduce using rule 5 (appl -> appl term .)
    ID              reduce using rule 5 (appl -> appl term .)
    LAMB            reduce using rule 5 (appl -> appl term .)
    LPAR            reduce using rule 5 (appl -> appl term .)
    RPAR            reduce using rule 5 (appl -> appl term .)


state 9

    (7) term -> lamb .
    $end            reduce using rule 7 (term -> lamb .)
    ID              reduce using rule 7 (term -> lamb .)
    LAMB            reduce using rule 7 (term -> lamb .)
    LPAR            reduce using rule 7 (term -> lamb .)
    RPAR            reduce using rule 7 (term -> lamb .)


state 10

    (2) lamb -> LAMB ID . DOT term
    DOT             shift and go to state 12


state 11

    (6) pterm -> LPAR term . RPAR
    (4) appl -> term .
  ! shift/reduce conflict for RPAR resolved as shift
    RPAR            shift and go to state 13
    ID              reduce using rule 4 (appl -> term .)
    LAMB            reduce using rule 4 (appl -> term .)
    LPAR            reduce using rule 4 (appl -> term .)


state 12

    (2) lamb -> LAMB ID DOT . term
    (7) term -> . lamb
    (8) term -> . ID
    (1) lamb -> . appl
    (2) lamb -> . LAMB ID DOT term
    (3) appl -> . pterm
    (4) appl -> . term
    (5) appl -> . appl term
    (6) pterm -> . LPAR term RPAR
    ID              shift and go to state 4
    LAMB            shift and go to state 3
    LPAR            shift and go to state 7

    term                           shift and go to state 14
    lamb                           shift and go to state 9
    appl                           shift and go to state 2
    pterm                          shift and go to state 6

state 13

    (6) pterm -> LPAR term RPAR .
    $end            reduce using rule 6 (pterm -> LPAR term RPAR .)
    ID              reduce using rule 6 (pterm -> LPAR term RPAR .)
    LAMB            reduce using rule 6 (pterm -> LPAR term RPAR .)
    LPAR            reduce using rule 6 (pterm -> LPAR term RPAR .)
    RPAR            reduce using rule 6 (pterm -> LPAR term RPAR .)


state 14

    (2) lamb -> LAMB ID DOT term .
    (4) appl -> term .
  ! reduce/reduce conflict for $end resolved using rule 2 (lamb -> LAMB ID DOT term .)
  ! reduce/reduce conflict for ID resolved using rule 2 (lamb -> LAMB ID DOT term .)
  ! reduce/reduce conflict for LAMB resolved using rule 2 (lamb -> LAMB ID DOT term .)
  ! reduce/reduce conflict for LPAR resolved using rule 2 (lamb -> LAMB ID DOT term .)
  ! reduce/reduce conflict for RPAR resolved using rule 2 (lamb -> LAMB ID DOT term .)
    $end            reduce using rule 2 (lamb -> LAMB ID DOT term .)
    ID              reduce using rule 2 (lamb -> LAMB ID DOT term .)
    LAMB            reduce using rule 2 (lamb -> LAMB ID DOT term .)
    LPAR            reduce using rule 2 (lamb -> LAMB ID DOT term .)
    RPAR            reduce using rule 2 (lamb -> LAMB ID DOT term .)


Conflicts:

shift/reduce conflict for ID in state 2 resolved as shift
shift/reduce conflict for LAMB in state 2 resolved as shift
shift/reduce conflict for LPAR in state 2 resolved as shift
shift/reduce conflict for RPAR in state 11 resolved as shift
reduce/reduce conflict in state 1 resolved using rule S' -> lamb
rejected rule (term -> lamb) in state 1
reduce/reduce conflict in state 8 resolved using rule appl -> appl term
rejected rule (appl -> term) in state 8
reduce/reduce conflict in state 14 resolved using rule lamb -> LAMB ID DOT term
rejected rule (appl -> term) in state 14% 

Solution

  • lamb is one of the derivations of term. But the last thing in lamb is itself a term. And term ID is a term (via abst).

    So let's say we have processed λ x . <term>, where <term> is something which has already been reduced to term. (It might have been a simple ID, for example.) And let's say the next token is y, which is an ID. Since λ x . <term> is a term, we could understand that as an application of λ x . <term> to y. To do that, we have to reduce λ x . <term> to term in order to proceed with the abst. So that's the reduction. (Note: "conflict for ID" does not mean that ID could be reduced. ID is the lookahead symbol. The conflict is whether the lookahead should be shifted or whether the handle on the top of the stack -- which doesn't yet include the lookahead -- should be reduced.)

    The conflict is the result of the possibility that the grammar also allows ID to be shifted without the reduction. Shifting ID would allow the application of <term> to y (abst: term ID) as part of the lambda body.

    That's more of an operator precedence issue (between lamb and abst) than an associativity issue. I think you'd want to resolve in favour of the shift, making application higher precedence than lambda.

    (By the way, what you call lamb is actually a function abstraction, as I understand the concept. abst is an application. LISP syntax forces applications to be surrounded by parentheses, but I guess you're aiming at something closer to Haskell syntax with automatic currying in a sequence of terms. Just a guess, though.)

    You're correct that the grammar is deficient in not allowing arbitrary terms to be applied. That can be resolved in the "usual way", by using a cascade of non-terminals each of which represents a precedence level. So I guess you'd have something like

    lamb: LAMB ID DOT lamb   # Lowest precedence
        | appl               # cascade
    appl: appl term          # left-associative
        | term               # cascade
    term: '(' lamb ')'       # base
        | ID