
Python Lark parser: no versions I've installed seem to have the .pretty() print method


Problem:

# From example at https://github.com/lark-parser/lark/blob/master/examples/json_parser.py
import sys

from lark import Lark, Transformer, v_args

json_grammar = r""" ... """

# TreeToJson, the Transformer subclass from the linked example, is omitted here.

### Create the JSON parser with Lark, using the LALR algorithm
json_parser = Lark(json_grammar, parser='lalr',
                   # Using the standard lexer isn't required, and isn't usually recommended.
                   # But, it's good enough for JSON, and it's slightly faster.
                   lexer='standard',
                   # Disabling propagate_positions and placeholders slightly improves speed
                   propagate_positions=False,
                   maybe_placeholders=False,
                   # Using an internal transformer is faster and more memory efficient
                   transformer=TreeToJson())
# Bind parse only after json_parser exists
parse = json_parser.parse

with open(sys.argv[1]) as f:
    tree = parse(f.read())
    print( tree )
    # Errors next 2 lines:
    # No: tree.pretty( indent_str="  " )
    # No: Lark.pretty( indent_str="  " )

Specific Error:

Setup:

Python version = 3.8.1

In Miniconda 3 on macOS Big Sur

conda install lark-parser

Installed 0.11.2-pyh44b312d_0

conda upgrade lark-parser

Installed 0.11.3-pyhd8ed1ab_0

Edit: Note about my Goal:

The goal here is NOT just to parse JSON; I just happen to be using a JSON example to try and learn. I want to write my own grammar for some data that I'm dealing with at work.

Edit: Why I Believe Pretty Print Should Exist:

Here's an example that uses the .pretty() function, and even includes output. But I can't seem to find anything (via conda at least) that includes .pretty(): http://github.com/lark-parser/lark/blob/master/docs/json_tutorial.md


Solution

  • I am not sure what I can put in this answer that is not already in the other answer. I will just try to create corresponding examples:

    json_parser = Lark(json_grammar, parser='lalr',
                       # Using the standard lexer isn't required, and isn't usually recommended.
                       # But, it's good enough for JSON, and it's slightly faster.
                       lexer='standard',
                       # Disabling propagate_positions and placeholders slightly improves speed
                       propagate_positions=False,
                       maybe_placeholders=False,
                       # Using an internal transformer is faster and more memory efficient
                       transformer=TreeToJson()
    )
    

    The important line here is transformer=TreeToJson(). It tells Lark to apply the Transformer class TreeToJson during parsing, so parse() returns the transformed result rather than a Tree. If you remove that line:

    json_parser = Lark(json_grammar, parser='lalr',
                       # Using the standard lexer isn't required, and isn't usually recommended.
                       # But, it's good enough for JSON, and it's slightly faster.
                       lexer='standard',
                       # Disabling propagate_positions and placeholders slightly improves speed
                       propagate_positions=False,
                       maybe_placeholders=False,
    )
    

    Then you get the Tree instance with the .pretty method:

    tree = json_parser.parse(test_json)
    print(tree.pretty())
    

    You can then apply the Transformer manually:

    res = TreeToJson().transform(tree)
    

    This is now a 'normal' Python object, like you would get from the stdlib json module, so probably a dictionary. Such an object has no .pretty() method, which explains why the call in the question fails.

    The transformer= option of the Lark constructor does this work during parsing, so a Tree is never built, which saves time and memory.