ruby regex perl pcre oniguruma

Named subroutines in Oniguruma regex engine?


In Perl, you can do this:

(?x)
(?(DEFINE)
  (?<animal>dog|cat)
)
(?&animal)

In Ruby (Oniguruma engine), it seems that the (?(DEFINE... syntax is not supported. Also, (?&... becomes \g. So, you can do this:

(?x)
(?<animal>dog|cat)
\g<animal>

But of course, this is not equivalent to the Perl example I gave above, because the first (?<animal>dog|cat) is not ignored: it has to match as well, since there isn't anything like (?(DEFINE....
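
To make the difference concrete, here is a minimal Ruby sketch of the pattern above (the variable name `naive` and the test strings are only illustrative). Because the defining group takes part in the match, the pattern needs two animals in a row:

```ruby
# The group that defines "animal" is itself matched,
# and then \g<animal> calls it a second time.
naive = /
  (?<animal>dog|cat)
  \g<animal>
/x

puts naive.match?("dog")     # => false (only one animal present)
puts naive.match?("dogcat")  # => true  (definition matches "dog", call matches "cat")
```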

If I want to define a large regex with a bunch of named subroutines, what I could once do in Perl can't be done this way.

It does seem that I could hack together a pretty awkward solution by doing something like this:

(?x)
(?:^$DEFINE
  (?<animal>dog|cat)
){0}
\g<animal>

But, that is pretty hackish. Is there a better way to do this? Does Oniguruma support a way to define named subroutines without having to try to "match" them first?

Alternatively, if there is a way to get true PCRE to work in Ruby, with (?(DEFINE... and (?&..., I'd take that too.

Thanks!


Solution

  • You don't need such a complicated hack. Writing:

    (?x)
    (?<animal>dog|cat){0}
    (?<color>red|green|blue){0}
    ...
    your main pattern here
    

    does exactly the same thing.

    Putting all group definitions inside (?:^$DEFINE ... ){0} is only cosmetic.

    Note that a group with the quantifier {0} is never tried (the quantifier is taken into account first), yet the named group is still defined. So this isn't really a hack; it is simply the way to do it with Oniguruma.
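
As a rough illustration of the {0} approach in Ruby, the sketch below defines two groups up front and then calls them from the main pattern. The variable name `pet`, the main pattern, and the sample strings are assumptions made up for this example:

```ruby
# Definitions first: each group is quantified with {0}, so it is never
# matched where it is written, but its name can still be called with \g<...>.
pet = /
  (?<animal> dog | cat ){0}
  (?<color>  red | green | blue ){0}

  # Main pattern: a color, one whitespace character, then an animal.
  \A \g<color> \s \g<animal> \z
/x

puts pet.match?("red dog")     # => true
puts pet.match?("blue cat")    # => true
puts pet.match?("dog")         # => false (no color)
puts pet.match?("purple cat")  # => false (unknown color)
```

Keeping every definition at the top and the main pattern at the bottom gives much the same layout as the Perl (?(DEFINE) ...) block, without the extra wrapper group.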