I'm trying to capture some input with a regex in Bash, but BASH_REMATCH comes up empty
#!/usr/bin/env /bin/bash
INPUT=$(cat input.txt)
TASK_NAME="MailAccountFetch"
MATCH_PATTERN="(${TASK_NAME})\s+([0-9]{4}-[0-9]{2}-[0-9]{2}\s[0-9]{2}:[0-9]{2}:[0-9]{2})"
while read -r line; do
if [[ $line =~ $MATCH_PATTERN ]]; then
TASK_RESULT=${BASH_REMATCH[3]}
TASK_LAST_RUN=${BASH_REMATCH[2]}
TASK_EXECUTION_DURATION=${BASH_REMATCH[4]}
fi
done <<< "$INPUT"
My input is:
MailAccountFetch 2017-03-29 19:00:00 Success 5.0 Second(s) 2017-03-29 19:03:00
By debugging the script (VS Code + Bash extension) I can see that the input line matches, because execution enters the if branch, but BASH_REMATCH is not populated with my two capture groups.
I'm on:
GNU bash, version 4.4.0(1)-release (x86_64-pc-linux-gnu)
What could be the issue?
LATER EDIT
Accepted Answer
Accepting most explanatory answer.
What finally resolved the issue:
The bashdb/VS Code debugging environment was causing the empty BASH_REMATCH. The code works fine when run on its own.
As Cyrus shows in his answer, a simplified version of your code - with the same input - does work on Linux in principle.
That said, your code references capture groups 3 and 4, whereas your regex only defines 2.
In other words: ${BASH_REMATCH[3]} and ${BASH_REMATCH[4]} are empty by definition.
Note, however, that if =~ signals success, BASH_REMATCH is never fully empty: at the very least - in the absence of any capture groups - ${BASH_REMATCH[0]} will be defined.
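To make the indexing concrete, here is a minimal sketch that runs the pattern from the question against the sample line and inspects BASH_REMATCH. The lowercase variable names are my own, and I have swapped \s for [[:space:]] so the snippet does not depend on the platform's regex library:

```bash
#!/usr/bin/env bash
# Sample line and task name taken from the question.
line='MailAccountFetch 2017-03-29 19:00:00 Success 5.0 Second(s) 2017-03-29 19:03:00'
task_name='MailAccountFetch'

# Same two capture groups as the original pattern; [[:space:]] replaces \s.
pattern="(${task_name})[[:space:]]+([0-9]{4}-[0-9]{2}-[0-9]{2}[[:space:]][0-9]{2}:[0-9]{2}:[0-9]{2})"

if [[ $line =~ $pattern ]]; then
  match_count=${#BASH_REMATCH[@]}  # index 0 (whole match) plus one entry per group
  whole_match=${BASH_REMATCH[0]}
  group1=${BASH_REMATCH[1]}
  group2=${BASH_REMATCH[2]}
  group3=${BASH_REMATCH[3]}        # no third group in the pattern, so this is empty

  printf 'elements: %s\n' "$match_count"
  printf '[0] = "%s"\n' "$whole_match"
  printf '[1] = "%s"\n' "$group1"
  printf '[2] = "%s"\n' "$group2"
  printf '[3] = "%s"\n' "$group3"
fi
```

With two groups in the pattern, the array holds three elements (indices 0 through 2), and anything beyond that expands to the empty string.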
There are some general points worth making:
Your shebang line reads #!/usr/bin/env /bin/bash, which is effectively the same as #!/bin/bash.
/usr/bin/env is typically used if you want a version other than /bin/bash to execute, one you've installed later and put in the PATH (too):
#!/usr/bin/env bash
ghoti points out that another reason for using #!/usr/bin/env bash is to also support less common platforms such as FreeBSD, where bash, if installed, is located in /usr/local/bin rather than the usual /bin.
In either scenario it is less predictable which bash binary will be executed, because it depends on the effective $PATH value at the time of invocation.
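If you want to see which binary an env-based shebang would pick up on a given machine, something along these lines should do. It is only a quick check, and the output depends entirely on the local $PATH:

```bash
#!/usr/bin/env bash
# First bash found via $PATH: the one "#!/usr/bin/env bash" would resolve to.
resolved_bash=$(command -v bash)
printf 'env would run: %s\n' "$resolved_bash"

# Every bash reachable via $PATH, in lookup order.
type -a bash

# The interpreter actually executing this script, and its version.
printf 'running under: %s (version %s)\n' "$BASH" "$BASH_VERSION"
```

Comparing the first and last lines of output shows whether the script is running under the same binary that a PATH lookup would find.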
=~ is one of the few Bash features that are platform-dependent: it uses the particular regex dialect implemented by the platform's regex libraries.
\s is a character class shortcut that is not available on all platforms, notably not on macOS; the POSIX-compliant equivalent is [[:space:]].
(In your particular case, \s should work, however, because your bash --version output suggests that you are on a Linux distro.)
It's better not to use all-uppercase shell variable names such as INPUT, so as to avoid conflicts with environment variables and special shell variables.
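Putting these points together, here is one possible rewrite of the script. Treat it as a sketch: the third and fourth capture groups are my guess at what you meant to extract (a single-word result and a numeric duration, as in your sample line), and the input is supplied inline rather than read from input.txt so that the snippet is self-contained:

```bash
#!/usr/bin/env bash

task_name='MailAccountFetch'

# Sample input from the question, inlined here instead of $(cat input.txt).
input='MailAccountFetch 2017-03-29 19:00:00 Success 5.0 Second(s) 2017-03-29 19:03:00'

# POSIX character classes instead of \s, and four capture groups:
#   1 = task name, 2 = last run timestamp,
#   3 = result (assumed to be one alphabetic word),
#   4 = duration (assumed to be digits and dots only)
sp='[[:space:]]'
pattern="(${task_name})${sp}+([0-9]{4}-[0-9]{2}-[0-9]{2}${sp}[0-9]{2}:[0-9]{2}:[0-9]{2})${sp}+([[:alpha:]]+)${sp}+([0-9.]+)"

while IFS= read -r line; do
  if [[ $line =~ $pattern ]]; then
    task_last_run=${BASH_REMATCH[2]}
    task_result=${BASH_REMATCH[3]}
    task_execution_duration=${BASH_REMATCH[4]}
  fi
done <<< "$input"

printf 'last run: %s\n' "$task_last_run"
printf 'result:   %s\n' "$task_result"
printf 'duration: %s\n' "$task_execution_duration"
```

Because the loop reads from a here-string rather than a pipeline, it runs in the current shell and the variables are still set after done.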