Tags: python, abstract-syntax-tree, lark-parser

Is there a way to make a Terminal match every NAME except for specific keywords?


I am using Lark to parse some text and need a way to match a NAME that is not one of certain keywords. I have the keywords listed out in a terminal, but I am not sure how to write the terminal I need using it.

Here is how I defined my keywords:

keywords: "var"
        | "let"
        | "type"

All help on this is appreciated!


Solution

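  • Lark has built-in support for keywords: when a string literal such as "let" collides with a regexp terminal such as NAME, the lexer gives the literal priority. So it is unlikely that you need to explicitly exclude keywords from NAME.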

    For example:

    from lark import Lark
    
    l = Lark("""
        %import common (LETTER, DIGIT)
        NAME: LETTER (LETTER | DIGIT)*
        keywords: "var"
                | "let"
                | "type"
    
        start: NAME | keywords
    """, parser="lalr")
    
    print(l.parse("hello"))     # Tree('start', [Token('NAME', 'hello')])
    print(l.parse("let"))       # Tree('start', [Tree('keywords', [])])
    

    Having said that, if you must, you can accomplish this by using a regexp:

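    # Raw string so that \b reaches the regexp as a word boundary.
    # Without \b, the lookahead would also reject names such as "letter".
    l = Lark(r"""
        %import common (LETTER, DIGIT)
        NAME: /(?!(let|type|var)\b)/ LETTER (LETTER | DIGIT)*
        start: NAME
    """)
    
    print(l.parse("hello"))     # Tree('start', [Token('NAME', 'hello')])
    print(l.parse("letter"))    # Tree('start', [Token('NAME', 'letter')])
    print(l.parse("let"))       # Raises an exception: no terminal matches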
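A negative lookahead of this kind can be checked in isolation with Python's `re` module. The sketch below is only an illustration: the names `LOOSE`, `STRICT` and `is_name` are hypothetical, and NAME is assumed to consist of ASCII letters and digits. It contrasts a lookahead without a word boundary against one with `\b`.

```python
import re

# Lookahead without a boundary: rejects anything that starts with a keyword.
LOOSE = re.compile(r"(?!(let|type|var))[A-Za-z][A-Za-z0-9]*")

# Lookahead with \b: rejects only the exact keywords.
STRICT = re.compile(r"(?!(let|type|var)\b)[A-Za-z][A-Za-z0-9]*")


def is_name(pattern, text):
    """Return True if the whole text matches the given NAME pattern."""
    return pattern.fullmatch(text) is not None


print(is_name(LOOSE, "letter"))   # False: the prefix "let" triggers the lookahead
print(is_name(STRICT, "letter"))  # True: "let" is not followed by a word boundary
print(is_name(STRICT, "let"))     # False: exact keyword
```

The word boundary is the design choice that matters here: without it, identifiers such as "letter" or "variable" are rejected along with the keywords themselves.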
    

    P.S. Keep in mind that a "TERMINAL" is upper-case and a "rule" is lower-case, and that the two behave differently in Lark. Your keywords definition is lower-case, so it is a rule, not a terminal.