bashprofiling

How can I profile a Bash shell script slow startup?


My Bash shell takes up to 3-4 seconds to start up, while if I start it with --norc it runs immediately.

I started "profiling" /etc/bash.bashrc and ~/.bashrc by manually inserting return statements and seeking for speed improvements, but it is not a quantitative process and it is not efficient.

How can I profile my Bash scripts, and see which commands take the most time to execute?


Solution

  • If you have GNU date (or another version that can output nanoseconds), do this at the beginning of /etc/bash.bashrc (or wherever you'd like to begin a trace in any Bash script):

    PS4='+ $(date "+%s.%N")\011 '
    exec 3>&2 2>/tmp/bashstart.$$.log
    set -x
    

    add

    set +x
    exec 2>&3 3>&-
    

    at the end of ~/.bashrc (or at the end of the section of any Bash script you'd like tracing to stop). The \011 is an octal tab character.

    You should get a trace log in /tmp/bashstart.PID.log that shows the seconds.nanoseconds timestamp of each command that was executed. The difference from one time to the next is the amount of time that the intervening step took.

    As you narrow things down, you can move set -x later and set +x earlier (or bracket several sections of interest selectively).

    Although it's not as fine-grained as GNU date's nanoseconds, Bash 5 includes a variable which gives the time in microseconds. Using it saves you from spawning an external executable for every line and works on Macs or elsewhere that doesn't have GNU date - as long as you have Bash 5, of course. Change the setting of PS4:

    PS4='+ $EPOCHREALTIME\011 '
    

    As pointed out by @pawamoy, you can use BASH_XTRACEFD to send the output of the trace to a separate file descriptor if you have Bash 4.1 or later. From this answer:

    #!/bin/bash
    
    exec 5> command.txt
    BASH_XTRACEFD="5"
    
    echo -n "hello "
    
    set -x
    echo -n world
    set +x
    
    echo "!"
    

    This will cause the trace output to go to the file command.txt leaving stdout and stdout to be output normally (or be redirected separately).

    Making sense of the results

    We can see the time each command took, as well as a running total, using OpenOffice / Libre calc, making it easy to spot that in the example below, lesspipe is taking 6ms.

    color coded calc spreadsheet

    1. Copy-paste the resulting log file into a new spreadsheet.
    2. Insert a new column after the timestamp for the diff to the next timestamp; if you inserted column C, insert =B2-B1 in cell C1, then copy paste that cell to the whole C column.
    3. Instead of absolute timestamps, we can start things at 0 by copying the first timestamp, selecting the timestamp column, and Ctrl+Shift+V Special Paste > subtract.
    4. You can color values higher than X red to highlight long individual commands.