
How do I best do balanced quoting with Perl's Regexp::Grammars?


Using Damian Conway's Regexp::Grammars, I'm trying to match several balanced quoting mechanisms, such as parens, single quotes, double quotes, and double dollars, so that 'foo' and "foo" match but 'foo" does not. This is the code I'm currently using:

<token: pair>        \'<literal>\'|\"<literal>\"|\$\$<literal>\$\$
<token: literal>    [\S]+

This generally works fine and allows me to say something like:

<rule: quote>            QUOTE <.as>? <pair>

My question is: how do I reshape the output to exclude the needless nesting for the pair token?

{
  '' => 'QUOTE AS \',\'',
  'quote' => {
               '' => 'QUOTE AS \',\'',
               'pair' => {
                           'literal' => ',',
                           '' => '\',\''
                         }
             }
},

Here, there is obviously no need for a pair level between quote and its literal value. Is there a better way to match 'foo', "foo", and $$foo$$, and maybe sometimes ( foo ), without creating a needless pair token each time? Can I preprocess that token out or fold it into the rule above? Or is there a better construct entirely that eliminates the need for it?


Solution

  • Per Damian, the answer was actually in the "Manual result distillation" part of the docs

    The correct answer is to tell your <pair> token to pass the result of
    each <literal> subrule through as its own result, using the MATCH=
    alias (see "Manual result distillation" in the module documentation), like so:

       <token: pair>        \'<MATCH=literal>\' | \"<MATCH=literal>\" | \$\$<MATCH=literal>\$\$
    

    Here is what the docs say:

    Regexp::Grammars also offers full manual control over the distillation process. If you use the reserved word MATCH as the alias for a subrule call [...] Note that, in this second case, even though <left_paren> and <right_paren> are captured to the result-hash, they are not returned, because the MATCH alias overrides the normal "return the result-hash" semantics and returns only what its associated subrule (i.e. <expr>) produces.
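
For anyone who wants to try this end to end, here is a minimal sketch that puts the question's rules and the MATCH= fix into one script. The <as> token is a hypothetical stand-in (the question never shows its definition), and the input string is inferred from the dump in the question, so treat both as assumptions.

```perl
use strict;
use warnings;
use Regexp::Grammars;

# NOTE: the <as> token below is a hypothetical stand-in; the original
# question calls <.as> but does not show how it is defined.
my $parser = qr{
    <quote>

    <rule: quote>      QUOTE <.as>? <pair>
    <token: as>        AS
    <token: pair>      \'<MATCH=literal>\' | \"<MATCH=literal>\" | \$\$<MATCH=literal>\$\$
    <token: literal>   [\S]+
}xms;

my %result;
my $input = q{QUOTE AS ','};   # assumed input, inferred from the dump in the question

if ($input =~ $parser) {
    %result = %/;   # copy the result-hash; %/ is overwritten by later grammar matches
}

# With the MATCH= alias, $result{quote}{pair} should hold the bare literal
# instead of a nested hash with its own 'literal' key.
my $pair = $result{quote}{pair};
print "pair: ", (defined $pair ? $pair : '(no match)'), "\n";
```

If this behaves as the documentation describes, the pair entry under quote is a plain string, so the extra level from the original dump disappears without renaming or removing the pair token itself.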