We are writing a Linux bash shell script that receives arguments. We wanted to fail the script when it receives possible shell injections commands as parameters. I have added some commands below using regex. Can someone give me possible list of all such commands so that we can avoid threats
invalid_format="(^.*[;&|].*$)|(\brmdir\b)|(\bls\b)|(rm\s-)|(\bping\b)"
if [[ $LOCAL_DIR =~ $invalid_format ]]; then
echo "Error! LOCAL_DIR cannot contain command chaining characters like ; && || or possible shell injection commands"
exit 1
A blocklist of content that is explicitly disallowed in your data is just an invitation for someone to come up with a vulnerability that isn't on it, or to obfuscate their code so a regex can't match it, or to find an oddball syntax honored by your actual shell but not by the one the blocklist/validator was written for.
Don't fight that losing battle; instead, write code that's safe no matter what content your data contains, by never injecting data in a context where it could be evaluated and executed as code.
This is inherently unsafe:
eval "grep -e \"$1\" /var/log/*" ## DO NOT EVER DO THIS
eval "grep -e '$1' /var/log/*" ## DO NOT EVER DO THIS EITHER
sh -c "grep -e \"$1\" /var/log/*" ## DO NOT EVER DO THIS EITHER
sh -c "grep -e '$1' /var/log/*" ## DO NOT EVER DO THIS EITHER
ssh somehost "grep -e \"$1\" /var/log/*" ## DO NOT EVER DO THIS EITHER
ssh somehost "grep -e '$1' /var/log/*" ## DO NOT EVER DO THIS EITHER
In all of these cases, a user-provided value ($1
) is used in a context where it will be parsed by the shell as code. In all of those cases, a value could thus run arbitrary commands.
This is always safe:
grep -e "$1" /var/log/* ## ALWAYS DO THIS INSTEAD
Again, this is always safe. Even if there's something like $(rm -rf ~)'$(rm -rf ~)'
inside your $1
, the shell doesn't evaluate any of that content as syntax, so the values are inherently incapable of being parsed as code.
system()
or some equivalentThis is inherently unsafe:
system("grep -e \"" + input + "\" /var/log/*") /* DO NOT EVER DO THIS */
system("grep -e '" + input + "' /var/log/*") /* DO NOT EVER DO THIS EITHER */
This is inherently safe:
setenv("logs_to_grep", input); /* IF YOU MUST USE system(), DO THIS INSTEAD */
system("grep -e \"$logs_to_grep\" /var/log/*")
Note how we didn't put the value inside of the string passed to a shell at all, but passed it out-of-band, in an environment variable (using a lower-case name, so it couldn't overwrite any of the environment variables with security-sensitive meaning to the operating system and supporting tools).
Let's say you need to run a command with untrusted input over SSH. printf %q
can help:
printf -v args_q '%q ' "$@"
ssh somehost 'bash -s' <<EOF
command_with $args_q
EOF
Why the bash -s
? To ensure your args_str
is parsed by bash, as printf %q
does not guarantee POSIX-safe output.
Instead of using system()
or anything that invokes sh -c
, use language-level facilities that directly use the execve()
syscall to invoke your script. For example, in Python:
# BAD/EVIL/INSECURE
subprocess.Popen('yourscript ' + arg, shell=True) ## DO NOT EVER DO THIS
# GOOD/SECURE
subprocess.Popen(['yourscript', arg]) ## DO THIS INSTEAD.
xargs -I{} sh -c 'something_with {}'
-- because your placeholder, {}
, substitutes into a value parsed by sh
as code, it's parsed as code, not data. Don't do that.
Instead, pass your data out-of-band: xargs -d $'\n' sh -c 'for arg; do something_with "$arg"; done' _
(if your data is inherently incapable of containing newline literals; if you can't prove that to be true, use NUL delimiters and xargs -0
instead).
find . -type f -exec sh -c 'something_with {}' \;
-- same problem as with xargs
above, with the same solution: find . -exec sh -c 'for arg; do something_with "$arg"; done' _ {} +
Don't use eval
, or source
, or anything else that parses a non-constant string as code. Again, these values are all perfectly fine and safe inside your data; you simply shouldn't ever use them in your code.
Don't make assumptions about filenames, except those that your operating system enforces itself. Don't use ls
in scripts. Don't separate filenames with newlines -- use NULs instead.