I'm porting a code from pyparsing to SLY, it's a lambda calc grammar. I have this rule
from sly import Lexer, Parser
class Lex(Lexer):
tokens = {ID, LAMB, DOT, LPAR, RPAR}
ID = r"[a-z]"
LAMB = r"λ"
DOT = r"\."
LPAR = r"\("
RPAR = r"\)"
ignore = " \t"
class Par(Parser):
"""
term : abst
| lamb
| ID
| "(" term ")"
abst | term ID
lamb | "λ" ID "." term
"""
tokens = Lex.tokens
debugfile = 'lamb.out'
@_("abst", "lamb", "ID")
def term(self, p):
return p[0]
@_("LPAR term RPAR")
def term(self, p):
return p.term
@_("term ID")
def abst(self, p):
return (p.term, p.ID)
@_("LAMB ID DOT term")
def lamb(self, p):
return ("λ", p.ID, p.term)
def parse(input_: str):
return Par().parse(Lex().tokenize(input_))
from pprint import pprint
pprint(parse("x y z"))
pprint(parse("λx.x"))
pprint(parse("(λx.λy.x y) a b"))
I know this grammar is wrong because it enforce an ID as last element of an application and this is not required for lambda calculus. But my problem is to understand the shift reduce conflict that is happening on last state, here is the debug output.
Grammar:
Rule 0 S' -> term
Rule 1 term -> LPAR term RPAR
Rule 2 term -> ID
Rule 3 term -> lamb
Rule 4 term -> abst
Rule 5 abst -> term ID
Rule 6 lamb -> LAMB ID DOT term
Terminals, with rules where they appear:
DOT : 6
ID : 2 5 6
LAMB : 6
LPAR : 1
RPAR : 1
error :
Nonterminals, with rules where they appear:
abst : 4
lamb : 3
term : 1 5 6 0
state 0
(0) S' -> . term
(1) term -> . LPAR term RPAR
(2) term -> . ID
(3) term -> . lamb
(4) term -> . abst
(6) lamb -> . LAMB ID DOT term
(5) abst -> . term ID
LPAR shift and go to state 2
ID shift and go to state 3
LAMB shift and go to state 6
term shift and go to state 1
lamb shift and go to state 4
abst shift and go to state 5
state 1
(0) S' -> term .
(5) abst -> term . ID
ID shift and go to state 7
state 2
(1) term -> LPAR . term RPAR
(1) term -> . LPAR term RPAR
(2) term -> . ID
(3) term -> . lamb
(4) term -> . abst
(6) lamb -> . LAMB ID DOT term
(5) abst -> . term ID
LPAR shift and go to state 2
ID shift and go to state 3
LAMB shift and go to state 6
term shift and go to state 8
lamb shift and go to state 4
abst shift and go to state 5
state 3
(2) term -> ID .
ID reduce using rule 2 (term -> ID .)
$end reduce using rule 2 (term -> ID .)
RPAR reduce using rule 2 (term -> ID .)
state 4
(3) term -> lamb .
ID reduce using rule 3 (term -> lamb .)
$end reduce using rule 3 (term -> lamb .)
RPAR reduce using rule 3 (term -> lamb .)
state 5
(4) term -> abst .
ID reduce using rule 4 (term -> abst .)
$end reduce using rule 4 (term -> abst .)
RPAR reduce using rule 4 (term -> abst .)
state 6
(6) lamb -> LAMB . ID DOT term
ID shift and go to state 9
state 7
(5) abst -> term ID .
ID reduce using rule 5 (abst -> term ID .)
$end reduce using rule 5 (abst -> term ID .)
RPAR reduce using rule 5 (abst -> term ID .)
state 8
(1) term -> LPAR term . RPAR
(5) abst -> term . ID
RPAR shift and go to state 10
ID shift and go to state 7
state 9
(6) lamb -> LAMB ID . DOT term
DOT shift and go to state 11
state 10
(1) term -> LPAR term RPAR .
ID reduce using rule 1 (term -> LPAR term RPAR .)
$end reduce using rule 1 (term -> LPAR term RPAR .)
RPAR reduce using rule 1 (term -> LPAR term RPAR .)
state 11
(6) lamb -> LAMB ID DOT . term
(1) term -> . LPAR term RPAR
(2) term -> . ID
(3) term -> . lamb
(4) term -> . abst
(6) lamb -> . LAMB ID DOT term
(5) abst -> . term ID
LPAR shift and go to state 2
ID shift and go to state 3
LAMB shift and go to state 6
term shift and go to state 12
lamb shift and go to state 4
abst shift and go to state 5
state 12
(6) lamb -> LAMB ID DOT term .
(5) abst -> term . ID
! shift/reduce conflict for ID resolved as shift
$end reduce using rule 6 (lamb -> LAMB ID DOT term .)
RPAR reduce using rule 6 (lamb -> LAMB ID DOT term .)
ID shift and go to state 7
Conflicts:
shift/reduce conflict for ID in state 12 resolved as shift%
I setup appl : term ID
to encode left associativity of application the right rule would be appl : term term
but this created an right association. My questions are:
Thanks @rici, I updated the grammar, I couldn't parse λx.λy.x
so I added p term as the last rule with parenthesis. Here is my grammar now. It works as expected, but, I got 15 warnings, lol
WARNING: 4 shift/reduce conflicts
WARNING: 11 reduce/reduce conflicts
Parser debugging for Par written to lamb.out
(('x', 'y'), 'z')
('λ', 'x', 'x')
((('λ', 'x', ('λ', 'y', 'x')), 'a'), 'b')
Since it worked like I expect I think I ordered the rules right. I'm still struggling to get an intuition on writing correct grammars. The core is here
from sly import Lexer, Parser
class Lex(Lexer):
tokens = {ID, LAMB, DOT, LPAR, RPAR}
ID = r"[a-z]"
LAMB = r"λ"
DOT = r"\."
LPAR = r"\("
RPAR = r"\)"
ignore = " \t"
class Par(Parser):
"""
lamb : LAMB ID DOT term
| appl
appl : appl term
| term
| pterm
pterm: LPAR term RPAR
term : lamb
| ID
"""
tokens = Lex.tokens
debugfile = 'lamb.out'
@_("LAMB ID DOT term")
def lamb(self, p):
return ("λ", p.ID, p.term)
@_("appl")
def lamb(self, p):
return p.appl
@_("appl term")
def appl(self, p):
return (p.appl, p.term)
@_("term")
def appl(self, p):
return p.term
@_("pterm")
def appl(self, p):
return p.pterm
@_("LPAR term RPAR")
def pterm(self, p):
return p.term
@_("ID")
def term(self, p):
return p.ID
@_("lamb")
def term(self, p):
return p.lamb
def parse(input_: str):
return Par().parse(Lex().tokenize(input_))
from pprint import pprint
pprint(parse("x y z"))
pprint(parse("λx.x"))
pprint(parse("(λx.λy.x) a b"))
Here is the debug
Grammar:
Rule 0 S' -> lamb
Rule 1 lamb -> appl
Rule 2 lamb -> LAMB ID DOT term
Rule 3 appl -> pterm
Rule 4 appl -> term
Rule 5 appl -> appl term
Rule 6 pterm -> LPAR term RPAR
Rule 7 term -> lamb
Rule 8 term -> ID
Terminals, with rules where they appear:
DOT : 2
ID : 2 8
LAMB : 2
LPAR : 6
RPAR : 6
error :
Nonterminals, with rules where they appear:
appl : 1 5
lamb : 7 0
pterm : 3
term : 2 4 5 6
state 0
(0) S' -> . lamb
(1) lamb -> . appl
(2) lamb -> . LAMB ID DOT term
(3) appl -> . pterm
(4) appl -> . term
(5) appl -> . appl term
(6) pterm -> . LPAR term RPAR
(7) term -> . lamb
(8) term -> . ID
LAMB shift and go to state 3
LPAR shift and go to state 7
ID shift and go to state 4
lamb shift and go to state 1
appl shift and go to state 2
term shift and go to state 5
pterm shift and go to state 6
state 1
(0) S' -> lamb .
(7) term -> lamb .
! reduce/reduce conflict for $end resolved using rule 0 (S' -> lamb .)
ID reduce using rule 7 (term -> lamb .)
LAMB reduce using rule 7 (term -> lamb .)
LPAR reduce using rule 7 (term -> lamb .)
state 2
(1) lamb -> appl .
(5) appl -> appl . term
(7) term -> . lamb
(8) term -> . ID
(1) lamb -> . appl
(2) lamb -> . LAMB ID DOT term
(3) appl -> . pterm
(4) appl -> . term
(5) appl -> . appl term
(6) pterm -> . LPAR term RPAR
! shift/reduce conflict for ID resolved as shift
! shift/reduce conflict for LAMB resolved as shift
! shift/reduce conflict for LPAR resolved as shift
$end reduce using rule 1 (lamb -> appl .)
RPAR reduce using rule 1 (lamb -> appl .)
ID shift and go to state 4
LAMB shift and go to state 3
LPAR shift and go to state 7
appl shift and go to state 2
term shift and go to state 8
lamb shift and go to state 9
pterm shift and go to state 6
state 3
(2) lamb -> LAMB . ID DOT term
ID shift and go to state 10
state 4
(8) term -> ID .
$end reduce using rule 8 (term -> ID .)
ID reduce using rule 8 (term -> ID .)
LAMB reduce using rule 8 (term -> ID .)
LPAR reduce using rule 8 (term -> ID .)
RPAR reduce using rule 8 (term -> ID .)
state 5
(4) appl -> term .
$end reduce using rule 4 (appl -> term .)
ID reduce using rule 4 (appl -> term .)
LAMB reduce using rule 4 (appl -> term .)
LPAR reduce using rule 4 (appl -> term .)
state 6
(3) appl -> pterm .
$end reduce using rule 3 (appl -> pterm .)
ID reduce using rule 3 (appl -> pterm .)
LAMB reduce using rule 3 (appl -> pterm .)
LPAR reduce using rule 3 (appl -> pterm .)
RPAR reduce using rule 3 (appl -> pterm .)
state 7
(6) pterm -> LPAR . term RPAR
(7) term -> . lamb
(8) term -> . ID
(1) lamb -> . appl
(2) lamb -> . LAMB ID DOT term
(3) appl -> . pterm
(4) appl -> . term
(5) appl -> . appl term
(6) pterm -> . LPAR term RPAR
ID shift and go to state 4
LAMB shift and go to state 3
LPAR shift and go to state 7
term shift and go to state 11
lamb shift and go to state 9
appl shift and go to state 2
pterm shift and go to state 6
state 8
(5) appl -> appl term .
(4) appl -> term .
! reduce/reduce conflict for $end resolved using rule 5 (appl -> appl term .)
! reduce/reduce conflict for ID resolved using rule 5 (appl -> appl term .)
! reduce/reduce conflict for LAMB resolved using rule 5 (appl -> appl term .)
! reduce/reduce conflict for LPAR resolved using rule 5 (appl -> appl term .)
! reduce/reduce conflict for RPAR resolved using rule 5 (appl -> appl term .)
$end reduce using rule 5 (appl -> appl term .)
ID reduce using rule 5 (appl -> appl term .)
LAMB reduce using rule 5 (appl -> appl term .)
LPAR reduce using rule 5 (appl -> appl term .)
RPAR reduce using rule 5 (appl -> appl term .)
state 9
(7) term -> lamb .
$end reduce using rule 7 (term -> lamb .)
ID reduce using rule 7 (term -> lamb .)
LAMB reduce using rule 7 (term -> lamb .)
LPAR reduce using rule 7 (term -> lamb .)
RPAR reduce using rule 7 (term -> lamb .)
state 10
(2) lamb -> LAMB ID . DOT term
DOT shift and go to state 12
state 11
(6) pterm -> LPAR term . RPAR
(4) appl -> term .
! shift/reduce conflict for RPAR resolved as shift
RPAR shift and go to state 13
ID reduce using rule 4 (appl -> term .)
LAMB reduce using rule 4 (appl -> term .)
LPAR reduce using rule 4 (appl -> term .)
state 12
(2) lamb -> LAMB ID DOT . term
(7) term -> . lamb
(8) term -> . ID
(1) lamb -> . appl
(2) lamb -> . LAMB ID DOT term
(3) appl -> . pterm
(4) appl -> . term
(5) appl -> . appl term
(6) pterm -> . LPAR term RPAR
ID shift and go to state 4
LAMB shift and go to state 3
LPAR shift and go to state 7
term shift and go to state 14
lamb shift and go to state 9
appl shift and go to state 2
pterm shift and go to state 6
state 13
(6) pterm -> LPAR term RPAR .
$end reduce using rule 6 (pterm -> LPAR term RPAR .)
ID reduce using rule 6 (pterm -> LPAR term RPAR .)
LAMB reduce using rule 6 (pterm -> LPAR term RPAR .)
LPAR reduce using rule 6 (pterm -> LPAR term RPAR .)
RPAR reduce using rule 6 (pterm -> LPAR term RPAR .)
state 14
(2) lamb -> LAMB ID DOT term .
(4) appl -> term .
! reduce/reduce conflict for $end resolved using rule 2 (lamb -> LAMB ID DOT term .)
! reduce/reduce conflict for ID resolved using rule 2 (lamb -> LAMB ID DOT term .)
! reduce/reduce conflict for LAMB resolved using rule 2 (lamb -> LAMB ID DOT term .)
! reduce/reduce conflict for LPAR resolved using rule 2 (lamb -> LAMB ID DOT term .)
! reduce/reduce conflict for RPAR resolved using rule 2 (lamb -> LAMB ID DOT term .)
$end reduce using rule 2 (lamb -> LAMB ID DOT term .)
ID reduce using rule 2 (lamb -> LAMB ID DOT term .)
LAMB reduce using rule 2 (lamb -> LAMB ID DOT term .)
LPAR reduce using rule 2 (lamb -> LAMB ID DOT term .)
RPAR reduce using rule 2 (lamb -> LAMB ID DOT term .)
Conflicts:
shift/reduce conflict for ID in state 2 resolved as shift
shift/reduce conflict for LAMB in state 2 resolved as shift
shift/reduce conflict for LPAR in state 2 resolved as shift
shift/reduce conflict for RPAR in state 11 resolved as shift
reduce/reduce conflict in state 1 resolved using rule S' -> lamb
rejected rule (term -> lamb) in state 1
reduce/reduce conflict in state 8 resolved using rule appl -> appl term
rejected rule (appl -> term) in state 8
reduce/reduce conflict in state 14 resolved using rule lamb -> LAMB ID DOT term
rejected rule (appl -> term) in state 14%
lamb
is one of the derivations of term
. But the last thing in lamb
is itself a term
. And term ID
is a term
(via abst
).
So let's say we have processed λ x . <term>
, where <term>
is something which has already been reduced to term
. (It might have been a simple ID
, for example.) And let's say the next token is y
, which is an ID
. Since λ x . <term>
is a term
, we could understand that as an application of λ x . <term>
to y
. To do that, we have to reduce λ x . <term>
to term
in order to proceed with the abst
. So that's the reduction. (Note: "conflict for ID" does not mean that ID
could be reduced. ID
is the lookahead symbol. The conflict is whether the lookahead should be shifted or whether the handle on the top of the stack -- which doesn't yet include the lookahead -- should be reduced.)
The conflict is the result of the possibility that the grammar also allows ID
to be shifted without the reduction. Shifting ID
would allow the application of <term>
to y
(abst: term ID
) as part of the lambda body.
That's more of an operator precedence issue (between lamb
and abst
) than an associativity issue. I think you'd want to resolve in favour of the shift, making application higher precedence than lambda.
(By the way, what you call lamb
is actually a function abstraction, as I understand the concept. abst
is an application. LISP syntax forces applications to be surrounded by parentheses, but I guess you're aiming at something closer to Haskell syntax with automatic currying in a sequence of terms. Just a guess, though.)
You're correct that the grammar is deficient in not allowing arbitrary terms to be applied. That can be resolved in the "usual way", by using a cascade of non-terminals each of which represents a precedence level. So I guess you'd have something like
lamb: LAMB ID DOT lamb # Lowest precedence
| appl # cascade
appl: appl term # left-associative
| term # cascade
term: '(' lamb ')' # base
| ID