Railroad diagrams in Pyparsing: How about Forward() declarations? Rule renaming?


I'm using pyparsing 3.0.9 and Python 3.9.16, and I'm trying to write a grammar for a subset of YAML, not so much for the resulting parser as for the railroad diagrams. The current state of the program is shown below.

Can someone point me in the right direction?

My code:

import pyparsing as pp

def make_parser():
    mapping = pp.Forward().set_name('mapping')
    label = pp.Word(pp.alphanums + '-_')
    true_false = pp.one_of('yes no true false').set_name('true_false')

    anchor = label.copy().set_name('anchor')
    tag    = label.copy().set_name('tag')
    alias  = label.copy().set_name('alias')

    key_value = (
        (pp.Keyword('yaml-scalar-event') +
            (pp.Keyword('yaml-scalar-event') ^ mapping))
    ).set_name('key_value')

    mapping = (
        pp.Keyword('yaml-mapping-start-event') +
        pp.ZeroOrMore(key_value) +
        pp.Keyword('yaml-mapping-end-event')
    )

    sequence = (
        anchor ^
        tag
    ).set_name('sequence')

    scalar = (
        alias ^
        tag ^
        ('plain_implicit' + true_false) ^
        ('quoted_implicit' + true_false) ^
        mapping
    ).set_name('scalar')

    node = (
        alias ^
        scalar ^
        sequence ^
        mapping
    ).set_name('node')

    document = (
        pp.Keyword('yaml-document-start-event') +
        pp.ZeroOrMore(node) +
        pp.Keyword('yaml-document-end-event')
    ).set_name('document')

    stream = (
        pp.Keyword('yaml-stream-start-event') +
        pp.ZeroOrMore(document) +
        pp.Keyword('yaml-stream-end-event')
    ).set_name('stream')

    return stream


def test_parser():
    parser = make_parser()

    parser.create_diagram('yaml_grammar.html',
        vertical = 2)



def main(args):
    parser = make_parser()
    parser.create_diagram('yaml_grammar.html', vertical = 2)


if __name__ == '__main__':
    import sys
    sys.exit(main(sys.argv))

Which produces the following output:

[Image: railroad diagram generated by the code above]


Solution

  • I love this! I like Michael Milton's addition of railroad diagramming to pyparsing, and I've done some very similar work myself just to get a railroad diagram. Your question raised some interesting points about the railroad diagramming process, and I'm making a few tweaks to the pyparsing diagramming code to make the diagrams better.

    First off, here are some changes in your parser to get a clean diagram:

    def make_parser():
        """
        stream ::= STREAM-START document* STREAM-END
        document ::= DOCUMENT-START node DOCUMENT-END
        node ::= ALIAS | SCALAR | sequence | mapping
        sequence ::= SEQUENCE-START node* SEQUENCE-END
        mapping ::= MAPPING-START (node node)* MAPPING-END
        """
    
        # when I define Forwards, I try to go to the lowest possible
        # term in the BNF, in this case node
        # mapping = pp.Forward().set_name('mapping')
        node = pp.Forward().set_name("node")
        label = pp.Word(pp.alphanums + '-_')
        true_false = pp.one_of('yes no true false').set_name('true_false')
    
        anchor = label.copy().set_name('anchor')
        tag    = label.copy().set_name('tag')
        alias  = label.copy().set_name('alias')
    
        # add Group around key_value to keep from merging it with surrounding
        # terms in the diagram
        key_value = pp.Group(
            node + node
            # (pp.Keyword('yaml-scalar-event') +
            #     (pp.Keyword('yaml-scalar-event') ^ mapping))
        )#.set_name('key_value')
        # I suppressed the key_value naming because I liked the explicit node-node
        # element in the diagram instead of the indirect key_value label.
    
        mapping = (
            # pyparsing will auto-promote strings to Literals, which should
            # be sufficient for your diagramming efforts, and less typing for you
            # (just so long as the string is immediately preceded or followed by
            # some kind of pyparsing ParserElement)
            # pp.Keyword('yaml-mapping-start-event') +
            'yaml-mapping-start-event' +
            # replaced ZeroOrMore usage with [...], purely a style choice
            # pp.ZeroOrMore(key_value) +
            key_value[...] +
            'yaml-mapping-end-event'
        ).set_name("mapping")
    
        sequence = (
            anchor ^
            tag
        ).set_name('sequence')
    
        scalar = pp.Group(
            # alias and mapping are already included in node
            # alias ^
            tag ^
            ('plain_implicit' + true_false) ^
            ('quoted_implicit' + true_false) #^
            # mapping
        ).set_name('scalar')
    
        # IMPORTANT!!! - be sure to use '<<=', not '=' when defining the expression
        # that needs to be parsed by a Forward.
        node <<= (
            alias ^
            scalar ^
            sequence ^
            mapping
        ).set_name('node')
    
        document = (
            'yaml-document-start-event' +
            node[...] +
            'yaml-document-end-event'
        ).set_name('document')
    
        stream = (
            'yaml-stream-start-event' +
            document[...] +
            'yaml-stream-end-event'
        ).set_name('stream')
    
        return stream
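
    To see why '<<=' matters, here is a minimal sketch using a toy grammar of nested parenthesised words. The names item, word, group, placeholder and wrapper are made up for illustration and are not part of your YAML grammar, and I'm assuming pyparsing 3.x:

```python
import pyparsing as pp

# Hypothetical toy grammar: words, optionally nested in parentheses.
item = pp.Forward().set_name("item")
word = pp.Word(pp.alphas)
group = pp.Group(pp.Suppress("(") + item[...] + pp.Suppress(")"))

# '<<=' stores the expression inside the existing Forward object, so the
# reference to `item` that `group` already holds becomes usable.
item <<= word | group

result = item.parse_string("(a (b c))", parse_all=True)
print(result.as_list())

# With '=' the name is only rebound to a new object. The Forward that
# `wrapper` refers to stays empty, so parsing through it fails.
placeholder = pp.Forward()
wrapper = pp.Suppress("(") + placeholder + pp.Suppress(")")
placeholder = pp.Word(pp.alphas)  # rebinds the name, does NOT fill the Forward

try:
    wrapper.parse_string("(a)")
    rebinding_failed = False
except pp.ParseException:
    rebinding_failed = True
print("parse through the empty Forward failed:", rebinding_failed)
```

    This is the same situation as in your original make_parser, where mapping was created as a Forward and later reassigned with '='.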
    

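    My changes, as noted in the comments above, were:

      • moved the Forward from mapping down to node, the lowest possible term in the BNF
      • wrapped key_value in a Group so it is not merged with the surrounding terms in the diagram
      • replaced the Keyword wrappers with plain strings, which pyparsing auto-promotes to Literals
      • replaced ZeroOrMore(expr) with expr[...]
      • removed alias and mapping from scalar, since node already includes them
      • assigned the node expression with '<<=' instead of '='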
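
    Two of the style changes in make_parser above (bare strings instead of pp.Keyword, and [...] instead of pp.ZeroOrMore) can be checked with a small sketch. The num expression and the 'start'/'end' markers are hypothetical stand-ins, not taken from your grammar. The last part shows one caveat I'd keep in mind if you ever use the grammar for real parsing rather than just diagrams: a Literal will match a prefix of a longer word, while a Keyword will not.

```python
import pyparsing as pp

num = pp.Word(pp.nums)

# Bare strings next to a ParserElement are promoted to Literals,
# and expr[...] builds a ZeroOrMore.
short_form = "start" + num[...] + "end"
long_form = pp.Literal("start") + pp.ZeroOrMore(num) + pp.Literal("end")

text = "start 1 2 3 end"
short_tokens = short_form.parse_string(text, parse_all=True).as_list()
long_tokens = long_form.parse_string(text, parse_all=True).as_list()
print(short_tokens == long_tokens)

is_zero_or_more = isinstance(num[...], pp.ZeroOrMore)
print(is_zero_or_more)

# Caveat: Literal matches the leading 'end' of 'ending'; Keyword refuses,
# because the match is immediately followed by another identifier character.
literal_hit = pp.Literal("end").parse_string("ending").as_list()
try:
    pp.Keyword("end").parse_string("ending")
    keyword_matched = True
except pp.ParseException:
    keyword_matched = False
print(literal_hit, keyword_matched)
```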

    Using this code, to create the diagram:

        parser.create_diagram(
            'yaml_grammar.html',
            show_groups=False,
            vertical=2,
        )
    

    gives this diagram:

    [Image: pyparsing railroad diagram, showing a bug in pyparsing]

    I didn't like a couple of things. For one, even though I set show_groups to False, we still see a grouping around the key-value nodes; that was a bug, which I have now fixed. Also, the (2) repetition indicator feels clunky when the repetition is only 2 elements long, so I've special-cased repetition to use this notation only for 3 or more elements.

    With these fixes/changes (to be in the next pyparsing release), I now get this diagram. I hope it is close to your intended look (and I'm sorry to have taken so long to respond on this).

    [Image: improved pyparsing diagram]