Tags: parsing, ebnf, peg, grako

PEG parsing: match at least one element while preserving order


Given the PEG rule:

rule = element1:'abc' element2:'def' element3:'ghi' ;

How do I rewrite this so that it matches at least one of the elements, and possibly all of them, while enforcing their order?

That is, I would like to match all of the following lines:

abc def ghi
abc def
abc     ghi
    def ghi
abc
    def
        ghi

but not an empty string or misordered expressions, e.g. def abc.

Of course, with three elements I could spell out the combinations in separate rules, but as the number of elements increases, this becomes error-prone.

Is there a way to specify this in a concise manner?


Solution

  • You can use optionals:

    rule = [element1:'abc'] [element2:'def'] [element3:'ghi'] ;
    

    Because this rule on its own also matches the empty string, you would use a semantic action for rule to check that at least one token was matched:

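    from grako.exceptions import FailedSemantics

    # Method of the semantics object passed to the parser.
    def rule(self, ast):
        # All elements are optional, so reject a match in which none is present.
        if not (ast.element1 or ast.element2 or ast.element3):
            raise FailedSemantics('Expecting at least one token')
        return ast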
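
To make the intended behaviour concrete, here is a minimal pure-Python sketch of the same idea: try each element as an optional in order, then reject the result if nothing matched. It does not use Grako at all; the names `ELEMENTS` and `match_rule`, and the whitespace-based tokenization, are assumptions made only for this illustration.

```python
# Hypothetical stand-in for the grammar rule: (name, literal) pairs in order.
ELEMENTS = [("element1", "abc"), ("element2", "def"), ("element3", "ghi")]


def match_rule(text, elements=ELEMENTS):
    """Return a dict of matched elements, or None if the rule fails."""
    tokens = text.split()  # assumption: tokens are separated by whitespace
    ast = {}
    pos = 0
    for name, literal in elements:
        # Optional element: consume it only if it is the next token.
        if pos < len(tokens) and tokens[pos] == literal:
            ast[name] = literal
            pos += 1
        else:
            ast[name] = None
    if pos != len(tokens):
        return None  # leftover input: unknown, repeated, or misordered token
    if not any(ast.values()):
        return None  # the "at least one token" check from the semantic action
    return ast
```

With this sketch, every line listed in the question yields a dict, while the empty string and `def abc` both yield `None`.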
    

    Another option is to use several choices:

    rule 
        = 
           element1:'abc' [element2:'def'] [element3:'ghi'] 
        | [element1:'abc']  element2:'def' [element3:'ghi'] 
        | [element1:'abc'] [element2:'def'] element3:'ghi' 
        ;
    

    Caching (memoization) will make the latter as efficient as the former.
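
Since the question worries about spelling out the alternatives by hand as the number of elements grows, the choice-based rule could also be generated. The following is only a sketch: the helper `at_least_one_in_order` is a hypothetical name, it emits one alternative per mandatory element in the same style as the rule above, and it assumes the literals contain no quote characters that would need escaping.

```python
def at_least_one_in_order(rule_name, elements):
    """Build the text of a rule with one alternative per mandatory element.

    `elements` is an ordered list of (name, literal) pairs.
    """
    alternatives = []
    for required in range(len(elements)):
        parts = []
        for i, (name, literal) in enumerate(elements):
            term = "{}:'{}'".format(name, literal)
            # Only the element at index `required` is mandatory in this
            # alternative; all the others are wrapped in [ ] as optionals.
            parts.append(term if i == required else "[" + term + "]")
        alternatives.append(" ".join(parts))
    return "{} = {} ;".format(rule_name, " | ".join(alternatives))


grammar_rule = at_least_one_in_order(
    "rule", [("element1", "abc"), ("element2", "def"), ("element3", "ghi")]
)
print(grammar_rule)
```

For the three elements in the question this prints the same three alternatives as the rule above, on a single line. The number of alternatives grows linearly with the number of elements, rather than with the number of possible combinations.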

    Then, you can add cut elements for additional efficiency and more meaningful error messages:

    rule 
        = 
           element1:'abc' ~  [element2:'def' ~] [element3:'ghi' ~] 
        | [element1:'abc' ~]  element2:'def' ~  [element3:'ghi' ~] 
        | [element1:'abc' ~] [element2:'def' ~] element3:'ghi'  ~
        ;
    

    or, keeping the semantic action that rejects an empty match:

    rule = [element1:'abc' ~] [element2:'def' ~] [element3:'ghi' ~] ;