regex grep capture-group

Named capture groups with grep


I use Unix grep and would like to know how I can handle named capture groups with it.

Currently this is what I have:

echo "foobar" | grep -P "(?<q>.)ooba(?<w>.)"

So in theory I have q=f and w=r, but I don't know how to use these variables or hand them over to the next command (for example awk) in the pipeline.

In the end, I would like to have the following result:

f r

The above string is just an example. The capture groups could be anywhere, could be in any number, and printing could also be in any order. I'm saying this because I'm not specifically looking for a way to extract the last and the first character of a string, but rather an approach to extract as many variables as I want from a string. I know tricks like using -o, \K or (?<=some text).*?(?=some other text), but these only extract one portion of the string and not multiple.


Solution

  • sed is limited to 9 capture groups (back-references \1 through \9). gawk has no such limit.

    In the question you said you want "an approach to extract as many variables as I want from a string".

    sed is best for the job if you are working with 1-9 groups. Otherwise, gawk's match function is helpful: given an array as a third argument, it stores each capture group in that array. (Using the same regex as Inian.)

    echo "foobar" | gawk '{match($0,/^(.)(.+)(.)$/,a);print a[1],a[3]}'
    f r
    

    PS: This alternative approach can be really helpful when dealing with more than 9 groups, and it works just as well for fewer. match is also tightly integrated with awk's variables such as NR, OFS and FS, so formatting the output is easier.
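
To make the comparison concrete, here is a minimal sketch of both approaches. It assumes a sed that accepts -E (GNU and BSD sed do) and that gawk is installed. The function names first_last_sed and pick_groups_gawk are hypothetical, chosen only for illustration, and the 12-character input is made up to show groups beyond the ninth.

```shell
# Hypothetical helper: up to 9 groups, referenced in sed as \1 .. \9.
first_last_sed() {
  sed -E 's/^(.)(.+)(.)$/\1 \3/'
}

# Hypothetical helper: 12 groups, more than sed can reference.
# gawk's three-argument match() fills the array g with one entry per group;
# OFS controls the separator that print puts between the fields.
pick_groups_gawk() {
  gawk -v OFS='-' '{
    if (match($0, /^(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)$/, g))
      print g[12], g[10], g[1]
  }'
}

echo "foobar" | first_last_sed
echo "abcdefghijkl" | pick_groups_gawk
```

Neither tool gives the groups names. They are addressed by position, so the order of the array indices or back-references in the output decides the printing order.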