regex peg

Is it possible to express something like /\s(foo|bar|baz)\s.*/ in PEG?


A regular expression like /\s(foo|bar|baz)\s.*/ would match the following string:

football bartender bazooka baz to the end
                          ^^^^^^^^^^^^^^^
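
The match marked above can be checked directly in JavaScript. This is only a small sketch; the variable names `input` and `match` are chosen here for illustration.

```javascript
// Check where the regular expression from the question matches.
const input = "football bartender bazooka baz to the end";
const match = input.match(/\s(foo|bar|baz)\s.*/);

// match.index is the offset of the whitespace before the keyword,
// match[0] is the whole matched text, match[1] the captured keyword.
console.log(match.index);              // 26
console.log(JSON.stringify(match[0])); // " baz to the end"
console.log(match[1]);                 // baz
```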

Is it possible to write Parsing Expression Grammar rules that would parse the string in a similar fashion, splitting it into a Head and a Tail?

Result <- Head Tail

football bartender bazooka baz to the end
         Head             |    Tail

Solution

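  • Yes, it's achievable using PEG. The key is the negative lookahead !tail in the word rule, which stops the head from consuming the word where the tail begins. Here's an example using pegjs (it handles only the "baz" keyword):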

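    // Head, one space, then the tail; the result is [head words, tail words].
    start = f:words space r:tail
    {
       return [f, r];
    }

    // The tail starts with the keyword "baz" and a space; only the words after it are returned.
    tail = f:"baz" space r:words
    {
       return r;
    }

    // One or more space-separated words; flatten the pairs and drop the undefined values from space.
    words = f:word r:(space word)*
    {
       return [f].concat(r).flat().filter(n => n);
    }

    // The negative lookahead !tail rejects a word at the position where the tail begins.
    word = !tail w:$([A-Za-z]+)
    {
       return w;
    }

    // A single space; returns undefined so it can be filtered out.
    space = " "
    {
       return;
    }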
    

    Output:

    [
       [
          "football",
          "bartender",
          "bazooka"
       ],
       [
          "to",
          "the",
          "end"
       ]
    ]
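
To make the role of the negative lookahead more concrete, the same rules can be sketched as hand-written JavaScript functions, with no parser generator involved. This is a minimal sketch and not what pegjs generates: the function names mirror the grammar rules, the `{ value, pos }` result shape and the `KEYWORDS` list are assumptions made for illustration, and the tail keyword is generalised from "baz" to foo, bar and baz as in the question's regex.

```javascript
// Each rule takes the input and a position and returns { value, pos }
// on success or null on failure, as a PEG rule would.
const KEYWORDS = ["foo", "bar", "baz"]; // assumption: taken from the question's regex

// space = " "
function space(s, pos) {
  return s[pos] === " " ? { value: null, pos: pos + 1 } : null;
}

// word = !tail $([A-Za-z]+)
function word(s, pos) {
  if (tail(s, pos) !== null) return null; // negative lookahead: consumes nothing
  const m = /^[A-Za-z]+/.exec(s.slice(pos));
  return m ? { value: m[0], pos: pos + m[0].length } : null;
}

// words = word (space word)*
function words(s, pos) {
  const first = word(s, pos);
  if (first === null) return null;
  const out = [first.value];
  let cur = first.pos;
  for (;;) {
    const sp = space(s, cur);
    const next = sp && word(s, sp.pos);
    if (!next) break; // backtrack: `cur` stays before the space
    out.push(next.value);
    cur = next.pos;
  }
  return { value: out, pos: cur };
}

// tail = ("foo" / "bar" / "baz") space words
function tail(s, pos) {
  const kw = KEYWORDS.find((k) => s.startsWith(k, pos));
  if (kw === undefined) return null;
  const sp = space(s, pos + kw.length);
  return sp && words(s, sp.pos);
}

// start = words space tail, and the whole input must be consumed
function parse(s) {
  const head = words(s, 0);
  const sp = head && space(s, head.pos);
  const rest = sp && tail(s, sp.pos);
  if (!rest || rest.pos !== s.length) throw new Error("no parse");
  return [head.value, rest.value];
}

console.log(JSON.stringify(parse("football bartender bazooka baz to the end")));
// [["football","bartender","bazooka"],["to","the","end"]]
```

"bazooka" stays in the head because the keyword "baz" is not followed by a space there, so `tail` fails at that position and the lookahead lets `word` proceed. At the standalone "baz", `tail` succeeds, `word` fails, and the head ends.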