sed grep csh

Extract Regex Capture Group in Script


I am writing a CSH script and attempting to extract text from a source string given a key.

#!/bin/csh -f
set source = "Smurfs\n\tPapa\nStar Trek\n\tRenegades\n\tStar Wars\n\tThe Empire Strikes Back\n"
set toFind = "Star Trek"
set regex = "$toFind[\s]*?(.*?)[\s]*?"
set match = `expr $source : $regex`
echo $match

The above code does not work, so I am missing something. I also tried placing "Star Trek" directly inside the regex rather than using a variable. I should see Renegades as the answer. Had I put "Star Wars" instead of "Star Trek", I should have seen The Empire Strikes Back.

Google search showed a possible solution using grep, such as

match = `grep -Po '<something>' <<< $source`

I did not know what to put for <something>, nor am I an expert in grep.

In the real code, I am reading text from a file. I just simplified things here.

Thoughts?


Solution

  • The real code reads the source from a file, so the solution is:

    set valueCapture=`cat /mypath/filename | grep -A1 "${tofind}" | grep -v "${tofind}" | xargs`
    

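    To capture the value from a string instead of a file, the string has to be piped in rather than passed to cat, which would treat it as a file name. Something like this should work (I did not test it):

    set valueCapture=`printf "%b" "$source" | grep -A1 "${tofind}" | grep -v "${tofind}" | xargs`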
    

    In both cases, the key I wish to find is set with:

    set tofind='asdf1@wxyz2'

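    The xargs part trims off the surrounding whitespace: with no command given, it just echoes its input back with the whitespace collapsed.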
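
As a quick check of the pipeline above, here is a minimal sketch that feeds it the sample text from the question through printf instead of a file. The literal key 'Star Trek' stands in for ${tofind}, and grep's -A1 option is assumed to be available (it is in GNU and BSD grep, but it is not POSIX). The command uses no shell variables, so it should behave the same in csh and sh.

```shell
# Sample text from the question; printf turns the \n and \t escapes
# into real newlines and tabs.
printf 'Smurfs\n\tPapa\nStar Trek\n\tRenegades\n\tStar Wars\n\tThe Empire Strikes Back\n' \
    | grep -A1 'Star Trek' \
    | grep -v 'Star Trek' \
    | xargs
# grep -A1 : keep the matching line plus the one line after it
# grep -v  : drop the line that holds the key itself
# xargs    : echo what is left, with the leading tab stripped
# prints: Renegades
```

Swapping both occurrences of 'Star Trek' for 'Star Wars' prints The Empire Strikes Back. Two caveats: if the key matches in several separate places, grep -A1 puts a "--" separator line between the groups and that line ends up in the result, and xargs will complain if the captured value contains an unmatched quote character.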