Hello Amazing People,
I am trying to match an extended glob pattern against an array of strings (file paths) and collect the list of files that match the pattern. The code below works for one extended glob pattern but not for another, similar one.
This is the pattern which doesn't work.
pattern="**cars-+(!(bad-cats|dogs))/src/bin/out/**"
Here is the shell code:
#!/bin/bash
# Enable extended globbing
shopt -s extglob
# pattern="**cars-!(bad-cats)/src/bin/out/**" # this extended glob pattern works
pattern="**cars-+(!(bad-cats|dogs))/src/bin/out/**" # this extended glob pattern doesn't work, and matching is very slow because of cars-ok-kk/src/main/out/start.txt
files="cars-white-pens/src/bin/out/file.txt,cars-ok-kk/src/main/out/start.txt,cars-grey-dogs/src/bin/out/bottle.txt,cars-bad-cats/src/bin/out/computer.txt,cars-whales/src/bin/mouse.txt,cars-dogs/src/bin/out/mouse.txt"
IFS=',' read -r -a files_array <<< "$files"
matching_files=""
for file in "${files_array[@]}"; do
    # $pattern is left unquoted so that it is treated as a pattern, not a literal string
    if [[ $file == $pattern ]]; then
        if [ -z "$matching_files" ]; then
            matching_files="$file"
        else
            matching_files="$matching_files,$file"
        fi
    fi
done
echo "Match: $matching_files"
PS: The extended glob pattern can be changed if that is needed for the code to work, but it must remain an extended glob pattern that supports excluding one or more directories, as shown in the pattern above.
Thanks in advance.
The commented-out extended glob pattern, pattern="**cars-!(bad-cats)/src/bin/**", works fine with the shell script but the other one doesn't.
When I run it, it takes some time (not sure why) and then prints the following:
Match: cars-white-pens/src/bin/out/file.txt,cars-grey-dogs/src/bin/out/bottle.txt,cars-bad-cats/src/bin/out/computer.txt,cars-dogs/src/bin/out/mouse.txt
It removed the src/main/out and src/bin/mouse.txt file strings but failed to remove the cars-bad-cats and cars-dogs file strings.
I expect the output to be Match: cars-white-pens/src/bin/out/file.txt,cars-grey-dogs/src/bin/out/bottle.txt since src/main/out and src/bin/mouse.txt don't match and cars-bad-cats and cars-dogs are excluded.
+(!(bad-cats|dogs)) matches bad-cats because it can (for instance) match it as the string bad followed by the string -cats, neither of which is bad-cats or dogs (so the !(dogs|bad-cats) pattern matches both of them). Similarly, dogs can be matched as the string do followed by the string gs.
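The difference can be checked directly with a small test. This is only a sketch: the variable names repeated and single, and the extra test string white-pens, are made up for the illustration.

```bash
#!/bin/bash
shopt -s extglob

# Patterns are stored in variables and used unquoted on the right of ==
# so that bash treats them as patterns rather than literal strings.
repeated='+(!(bad-cats|dogs))'   # one or more pieces, each of which is not an excluded name
single='!(bad-cats|dogs)'        # the whole string must not be an excluded name

results=""
for name in bad-cats dogs white-pens; do
    if [[ $name == $repeated ]]; then r=yes; else r=no; fi
    if [[ $name == $single ]]; then s=yes; else s=no; fi
    line="$name repeated=$r single=$s"
    echo "$line"
    results+="$line;"
done
```

With the repeated form all three names should report yes, while the single form should report no for bad-cats and dogs.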
To fix the code, change the pattern assignment to:
pattern='*cars-!(dogs|bad-cats)/src/bin/out/*'
!(bad-cats|dogs) does not match either bad-cats or dogs because it doesn't allow them to be broken into parts that can be matched separately.
Removing the +( and the corresponding ) from the pattern also massively speeds up the matching because it eliminates huge amounts of pointless backtracking.
The fix also replaces ** in the original pattern with * because the second * is redundant, and might make pattern matching slower. (** is only special when doing filename expansion and the globstar shell option is enabled (see the globstar section in glob - Greg's Wiki).)
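For reference, here is a sketch of the question's script with the corrected pattern applied. Collecting the matches in an array (called matches here, a name not in the original script) and joining them at the end is just one way to build the comma-separated list; the original if/else concatenation would work equally well.

```bash
#!/bin/bash
# Enable extended globbing
shopt -s extglob

pattern='*cars-!(dogs|bad-cats)/src/bin/out/*'
files="cars-white-pens/src/bin/out/file.txt,cars-ok-kk/src/main/out/start.txt,cars-grey-dogs/src/bin/out/bottle.txt,cars-bad-cats/src/bin/out/computer.txt,cars-whales/src/bin/mouse.txt,cars-dogs/src/bin/out/mouse.txt"

IFS=',' read -r -a files_array <<< "$files"

matches=()
for file in "${files_array[@]}"; do
    # $pattern is deliberately unquoted so that it is used as a pattern
    if [[ $file == $pattern ]]; then
        matches+=("$file")
    fi
done

# Join the matches with commas; IFS is changed only inside the subshell
matching_files=$(IFS=','; echo "${matches[*]}")
echo "Match: $matching_files"
```

This should print Match: cars-white-pens/src/bin/out/file.txt,cars-grey-dogs/src/bin/out/bottle.txt, which is the output the question asks for.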