Using Damian Conway's Regexp::Grammars, I'm trying to match different balanced quoting mechanisms ('foo', "foo", but not 'foo"), such as parens, quotes, double quotes, and double dollars. This is the code I'm currently using:
<token: pair> \'<literal>\'|\"<literal>\"|\$\$<literal>\$\$
<token: literal> [\S]+
This generally works fine and allows me to say something like:
<rule: quote> QUOTE <.as>? <pair>
My question is: how do I reshape the output to exclude the needless notation for the pair token?
{
    '' => 'QUOTE AS \',\'',
    'quote' => {
        '' => 'QUOTE AS \',\'',
        'pair' => {
            'literal' => ',',
            '' => '\',\''
        }
    }
},
Here, there is obviously no desire to have pair sitting between quote and its literal value. Is there a better way to match 'foo', "foo", and $$foo$$, and maybe sometimes ( foo ), without creating a needless pair token each time? Can I preprocess out that token or fold it into the above? Or write a better construct entirely that eliminates the need for it?
Per Damian, the answer was actually in the "Manual result distillation" part of the docs:
The correct answer is to tell your <pair> token to pass the result of each <literal> subrule through as its own result, using the MATCH= alias (see: "Manual result distillation" in the module documentation) like so:
<token: pair> \'<MATCH=literal>\' | \"<MATCH=literal>\" |
\$\$<MATCH=literal>\$\$
Here is what the docs say:
Regexp::Grammars also offers full manual control over the distillation process. If you use the reserved word MATCH as the alias for a subrule call [...] Note that, in this second case, even though and are captured to the result-hash, they are not returned, because the MATCH alias overrides the normal "return the result-hash" semantics and returns only what its associated subrule (i.e. ) produces.
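For anyone who wants to try this end to end, here is a minimal sketch that puts the rules from the question and the MATCH= fix into one script. It assumes Regexp::Grammars is installed from CPAN. The `as` token is a hypothetical stand-in, since the question calls <.as> but never shows its definition, and the top-level <quote> call and the sample input are likewise only for illustration.

```perl
use strict;
use warnings;
use Data::Dumper;

# Build the grammar inside a do-block so that Regexp::Grammars only
# affects regexes compiled in this lexical scope.
#
# Assumptions, not from the original post:
#   - the "as" token (matching the word AS) is a hypothetical definition
#   - the top-level <quote> call is just a way to start the parse
#
# The pair token uses the MATCH= alias in every alternative, so the
# token's result should be whatever the literal subrule produced.
my $parser = do {
    use Regexp::Grammars;
    qr{
        <quote>

        <rule: quote>     QUOTE <.as>? <pair>

        <token: as>       AS

        <token: pair>     \'<MATCH=literal>\' | \"<MATCH=literal>\" | \$\$<MATCH=literal>\$\$

        <token: literal>  [\S]+
    }x;
};

my $input = q{QUOTE AS ','};

if ($input =~ $parser) {
    # On a successful match, Regexp::Grammars leaves the result-hash in %/
    print Dumper(\%/);
}
else {
    print "no match\n";
}
```

If the alias behaves as the documentation describes, the 'pair' entry in the dump should hold the bare literal rather than a nested hash with its own 'literal' key.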