regex bash regex-greedy

Bash regex ungreedy match


I have a regex pattern that is supposed to match at multiple places in a string. I want to get all the match groups into one array and then print every element.

So, I've been trying this:

#!/bin/bash

f=$'\n\tShare1   Disk\n\tShare2  Disk\n\tPrnt1  Printer'
regex=$'\n\t(.+?)\\s+Disk'
if [[ $f =~ $regex ]]
then
    for match in "${BASH_REMATCH[@]}"
    do
        echo "New match: $match"
    done
else
    echo "No matches"
fi

Result:

New match: 
    Share1   Disk
    Share2  Disk
New match: Share1   Disk
    Share2 

The expected result would have been

New match: Share1
New match: Share2

I think it doesn't work because my .+? is being matched greedily. So I looked up how non-greedy matching can be done with bash regex, but everyone seems to suggest using grep with Perl regex instead.

But surely there has to be another way. I was thinking of maybe something like [^\\s]+ in place of .+?, but the output for that was:

New match: 
    Share1   Disk
New match: Share1

... Any ideas?


Solution

  • There are a couple of issues here. First, BASH_REMATCH[0] holds the entire text that matched the pattern, not a capture group. The capture groups start at index 1, so use "${BASH_REMATCH[@]:1}" to get only what the groups matched.

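    However, bash's =~ only reports the first match in the string; there is no way to ask it for every match in one go, so bash probably isn't the right tool for this job. It also uses POSIX extended regular expressions, which have no non-greedy quantifiers, so .+? is not lazy there. Since each entry is on its own line, though, you could split the input into lines and apply the pattern to each line, like this: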

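    f=$'\n\tShare1   Disk\n\tShare2  Disk\n\tPrnt1  Printer'
    # POSIX ERE has no lazy quantifiers, and \S and \s are not part of
    # POSIX ERE, so use bracket expressions instead.
    regex=$'\t([^[:space:]]+)[[:space:]]+Disk'
    # Apply the pattern to one line at a time.
    while IFS= read -r line; do
        if [[ $line =~ $regex ]]
        then
            # Index 0 is the whole match; the capture groups start at 1.
            printf 'New match: %s\n' "${BASH_REMATCH[@]:1}"
        else
            echo "No matches"
        fi
    done <<<"$f"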
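
If you do want every captured name in one array, as the question asks, one possible approach is to match repeatedly against whatever is left of the string. This is only a sketch: the names `matches` and `rest` are made up for illustration, and it assumes the pattern can never match an empty string (otherwise the loop would not terminate).

```shell
#!/bin/bash

f=$'\n\tShare1   Disk\n\tShare2  Disk\n\tPrnt1  Printer'
# Bracket expressions stand in for the lazy quantifier that POSIX ERE lacks.
regex=$'\n\t([^[:space:]]+)[[:space:]]+Disk'

matches=()   # hypothetical name: collects the first capture group of each match
rest=$f      # hypothetical name: the part of the input not yet scanned

while [[ $rest =~ $regex ]]; do
    matches+=("${BASH_REMATCH[1]}")
    # Remove everything up to and including this match, then try again.
    # Quoting the expansion makes the matched text a literal, not a glob.
    rest=${rest#*"${BASH_REMATCH[0]}"}
done

if (( ${#matches[@]} )); then
    printf 'New match: %s\n' "${matches[@]}"
else
    echo "No matches"
fi
```

With the sample input above, this prints "New match: Share1" and "New match: Share2", and the Printer line is skipped because it never satisfies the pattern.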