bash shell bash-trap

Why is the exit code always 0 inside handle_exit, and how can I distinguish error from success?


I have a bash script that runs pg_dumpall and uploads the result to S3. If something goes wrong, it should email the admin with the exact error message; if everything works, it should send a different email.

#!/usr/bin/env bash

set -e
set -E
set -o pipefail
set -u
set -x

IFS=$'\n\t'

log="/tmp/error.txt"
exec 2>"$log"

handle_error() {    
    error_message="$(< "$log")"
    echo "$(caller): ${BASH_COMMAND}: ${error_message}"
    exit 1
}

handle_exit() {
    rm -rf ${backup_dirname}
    rm /tmp/error.txt
    echo "We are exiting $?"
}

trap "handle_exit" EXIT
trap "handle_error $?" ERR

backup_root="$HOME/Desktop/backups"
backup_dirname="$( date '+%Y_%m_%d_%HH_%MM_%SS' )"
backup_path="${backup_root}/${backup_dirname}"
encoding="UTF8"
globals_filename="globals.dump"
host="localhost"
port="5432"
username="abc"

mkdir -p "${backup_path}"
cd "${backup_root}"

pg_dumpall \
    --no-role-passwords \
    --no-password \
    --globals-only \
    --encoding="${encoding}" \
    --file="${backup_dirname}/${globals_filename}" \
    --host="${host}" \
    --port="${port}" \
    --username="${username}"

In my script above, when pg_dumpall fails for any reason, handle_error is called and then handle_exit is called, but inside handle_exit $? is 0.

This is the output of a run with an error:

+ IFS='
        '
+ log=/tmp/error.txt
+ exec
50 ./scripts/test-local-backup.sh: pg_dumpall --no-role-passwords --no-password --globals-only --encoding="${encoding}" --file="${backup_dirname}/${globals_filename}" --host="${host}" --port="${port}" --username="${username}": + trap handle_exit EXIT
+ trap 'handle_error 0' ERR
+ backup_root=/Users/vr/Desktop/backups
++ date +%Y_%m_%d_%HH_%MM_%SS
+ backup_dirname=2023_07_28_15H_58M_12S
+ backup_path=/Users/vr/Desktop/backups/2023_07_28_15H_58M_12S
+ encoding=UTF8
+ globals_filename=globals.dump
+ host=localhost
+ port=5432
+ username=abc
+ mkdir -p /Users/vr/Desktop/backups/2023_07_28_15H_58M_12S
+ cd /Users/vr/Desktop/backups
+ pg_dumpall --no-role-passwords --no-password --globals-only --encoding=UTF8 --file=2023_07_28_15H_58M_12S/globals.dump --host=localhost --port=5432 --username=abc
pg_dumpall: error: connection to server at "localhost" (::1), port 5432 failed: FATAL:  role "abc" does not exist
++ handle_error 0
We are exiting 0

And this is what a successful run looks like:

+ IFS='
        '
+ log=/tmp/error.txt
+ exec
We are exiting 0

Solution

  • handle_exit is called with $? = 0 on both conditions

    Because that's the trap code you set.

    trap "handle_error $?" ERR
    

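    It uses double quotes, so the string is expanded at the time you set the ERR trap, using the exit code of the previous command (successfully setting the EXIT trap in your case). The ERR trap code is therefore handle_error 0, which is what the trace line + trap 'handle_error 0' ERR shows. Use single quotes (trap 'handle_error $?' ERR) so that $? is expanded when the trap fires. Note also that handle_exit reads $? only after its two rm commands, so it prints the status of the last rm, not the exit status of the script; save $? on the first line of the function or pass it in as an argument. You should use a tool like https://www.shellcheck.net/, which can recognize such errors.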
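
    A minimal sketch of the quoting difference, assuming bash. The report function is a hypothetical stand-in for a real handler, and each variant runs in a command-substitution subshell so that its trap and exit do not affect the calling script:

    ```bash
    #!/usr/bin/env bash

    # Hypothetical handler: prints the status it was given.
    report() { echo "status=$1"; }

    # Double quotes: $? is expanded when `trap` itself runs, so it holds the
    # status of the preceding `true` (0), not the later exit status.
    # The subshell deliberately exits non-zero, hence the `|| true`.
    early="$( true; trap "report $?" EXIT; exit 3 )" || true

    # Single quotes: $? is expanded when the trap fires, so it holds the
    # status passed to `exit` (3).
    late="$( true; trap 'report $?' EXIT; exit 3 )" || true

    echo "double-quoted trap: ${early}"   # status=0
    echo "single-quoted trap: ${late}"    # status=3
    ```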


    Is there a better way to get the error message without piping to /tmp/error.txt?

    What do you dislike about the current solution? I would suggest using mktemp instead of a hardcoded file, but apart from that I kinda like it.
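
    A rough sketch of the mktemp variant. The template name backup-error.XXXXXX and the simulated failing command are assumptions for illustration; in the real script you would redirect the whole script's stderr with exec 2>"$log" as before:

    ```bash
    #!/usr/bin/env bash

    # Create a unique log file instead of the fixed /tmp/error.txt.
    # An explicit template works with both GNU and BSD/macOS mktemp.
    log="$(mktemp "${TMPDIR:-/tmp}/backup-error.XXXXXX")"

    # Remove the log when the script exits, whatever the outcome.
    trap 'rm -f -- "$log"' EXIT

    # Simulated failing step: writes to stderr, which is captured in the log.
    { echo "simulated failure" >&2; false; } 2>"$log" || true

    # Read the captured message back, as handle_error does in the question.
    error_message="$(< "$log")"
    echo "captured: ${error_message}"
    ```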

    Also instead of

    error_message="$(< "$log")"
    echo "$(caller): ${BASH_COMMAND}: ${error_message}"
    

    you can simply

    echo -n "$(caller): ${BASH_COMMAND}: "
    cat "$log"
    

    I can send an email from handle_error for the failure case but what about success?

    Why not just send the success email at the end of the script? Or send both, success and error, in the EXIT trap.
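
    One way the second option could look, as a sketch only. send_mail is a hypothetical placeholder that just prints; replace it with whatever mail command you actually use. The two runs are simulated in subshells so each gets its own EXIT trap:

    ```bash
    #!/usr/bin/env bash

    # Hypothetical stand-in for the real mail command; it only prints here.
    send_mail() {
        echo "MAIL subject='$1' body='$2'"
    }

    handle_exit() {
        local code=$1    # exit status captured at the moment the trap fired
        if (( code == 0 )); then
            send_mail "backup ok" "finished without errors"
        else
            send_mail "backup failed" "exit status ${code}"
        fi
    }

    # Single quotes, so $? is expanded when the EXIT trap fires.
    ok_run="$( trap 'handle_exit $?' EXIT; exit 0 )"
    bad_run="$( trap 'handle_exit $?' EXIT; exit 7 )" || true

    echo "${ok_run}"
    echo "${bad_run}"
    ```

    Because handle_exit does not call exit itself, the script still ends with its original exit status.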


    Also what happens if my email sending code generates an error inside handle_error?

    As far as I remember, the ERR trap is disabled while the ERR trap handler itself is executing, but I haven't found the source yet.

    Also you can always do stuff like command || true or

    {
       commands
       which
       might
       fail
    } || true
    

    or simply do set +e.
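
    A small sketch of how these guards behave under set -e, run in a subshell so it does not change the options of the calling script. Note that inside a group on the left side of ||, set -e is ignored for every command, so the group keeps going after a failure:

    ```bash
    #!/usr/bin/env bash

    output="$(
        set -e

        # A single failing command, explicitly tolerated.
        false || true

        {
            # set -e is ignored inside a group that is the left side of ||,
            # so the failure here does not stop the group.
            false
            echo "group continued"
        } || true

        echo "script continued"
    )"

    echo "${output}"
    ```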


    To clarify how you can pass exit codes around between the traps, here is a simplified example:

    #!/bin/bash
    
    trap 'handle_exit $?' EXIT
    trap 'handle_err $?' ERR
    
    handle_exit() {
      printf 'handle_exit: $? = %s   $1 = %s\n' $? $1
      exit $1
    }
    
    handle_err() {
      printf 'handle_err: $? = %s   $1 = %s\n' $? $1
      exit $(($1 + 1))
    }
    
    set -e
    false
    

    It would print

    handle_err: $? = 1   $1 = 1
    handle_exit: $? = 2   $1 = 2
    

    and the overall exit status is also 2.