performance bash

Bash Script is super slow


I'm updating an old script that parses ARP data and extracts useful information from it. We added a new router, and while I can pull the ARP data from it, the data is in a new format. I have a file, "zTempMonth", which contains all the ARP data from both sets of routers, and I need to compile it down into a new, normalized format. The code below does what I need logically, but it is extremely slow: it will take days to run these loops, where the script previously took 20-30 minutes. Is there a way to speed this up, or to identify what's slowing it down?

Thank you in advance,

    echo "Parsing zTempMonth"
    while read LINE
    do
            wc=`echo $LINE | wc -w`
            if [[ $wc -eq "6" ]]; then
                    true
                    out=$(echo $LINE | awk '{ print $2 " " $4 " " $6}')
                    echo $out >> zTempMonth.tmp

            else
                    false
            fi

            if [[ $wc -eq "4" ]]; then
                    true
                    out=$(echo $LINE | awk '{ print $1 " " $3 " " $4}')
                    echo $out >> zTempMonth.tmp
            else
                    false
            fi


    done < zTempMonth

Solution

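1. while read loops are slow.
2. Subshells in a loop are slow: each $(...), backtick substitution, or pipeline forks at least one process per input line.
3. >> redirections in a loop are slow: each one reopens the output file in append mode, like open(f, 'a').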
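
To get a feel for how much the per-line forks and reopens cost, a rough benchmark along these lines may help. It is only a sketch: the sample data, the temporary file names, and the function names slow and fast are all made up for illustration, and the absolute timings will vary by machine.

```shell
#!/usr/bin/env bash
# Rough benchmark sketch. The sample data and file names are made up.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample"

# Generate 300 six-word lines of fake data.
for ((i = 0; i < 300; i++)); do
    printf 'a ip%d b mac%d c if%d\n' "$i" "$i" "$i"
done > "$sample"

# Slow: forks wc and awk for every line and reopens the output file each time.
slow() {
    while read -r LINE; do
        wc=$(echo "$LINE" | wc -w)
        if [[ $wc -eq 6 ]]; then
            out=$(echo "$LINE" | awk '{ print $2 " " $4 " " $6 }')
            echo "$out" >> "$tmpdir/slow.out"
        fi
    done < "$sample"
}

# Fast: shell builtins only, output file opened once.
fast() {
    while read -ra f; do
        if (( ${#f[@]} == 6 )); then
            printf '%s %s %s\n' "${f[1]}" "${f[3]}" "${f[5]}"
        fi
    done < "$sample" > "$tmpdir/fast.out"
}

time slow
time fast

# Both loops should produce identical output.
cmp "$tmpdir/slow.out" "$tmpdir/fast.out" && echo "outputs match"
```

The gap between the two timings should grow roughly in proportion to the number of input lines, since the slow version starts several processes for every line it reads.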

You can speed this up while staying in pure bash just by eliminating #2 and #3:

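    #!/usr/bin/env bash

    # read -a splits each line into an array; -r keeps backslashes literal.
    while read -ra line; do
        case "${#line[@]}" in
            6) printf '%s %s %s\n' "${line[1]}" "${line[3]}" "${line[5]}";;
            4) printf '%s %s %s\n' "${line[0]}" "${line[2]}" "${line[3]}";;
        esac
    # The output file is opened once for the whole loop, not once per line.
    done < zTempMonth >> zTempMonth.tmp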
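
If the array handling in that loop is unfamiliar, the following sketch shows what read -a puts into the array and how the word count is obtained without calling wc. The helper name show_fields and the two sample lines are hypothetical; the real router output may look different.

```shell
#!/usr/bin/env bash
# Print the word count and the first and last word of each line on stdin.
show_fields() {
    local -a line
    while read -ra line; do
        local n=${#line[@]}
        (( n > 0 )) || continue    # skip blank lines
        printf 'fields=%d first=%s last=%s\n' "$n" "${line[0]}" "${line[n-1]}"
    done
}

# Made-up sample lines; the real router output may differ.
show_fields <<'EOF'
Internet 10.0.0.1 12 aabb.ccdd.eeff ARPA Vlan10
10.0.0.2 ether aa:bb:cc:dd:ee:ff eth0
EOF
```

Note that bash arrays are zero-indexed, which is why awk's $2, $4, $6 become line[1], line[3], line[5] in the loop above.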
    

If the file has more than a few lines, though, this will still be slower than pure awk. Consider an awk script as simple as this:

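    BEGIN {
        # Send the status message to stderr so it is not appended to the output file.
        print "Parsing zTempMonth" > "/dev/stderr"
    }

    # Six-field lines: keep fields 2, 4, and 6.
    NF == 6 {
        print $2 " " $4 " " $6
    }

    # Four-field lines: keep fields 1, 3, and 4.
    NF == 4 {
        print $1 " " $3 " " $4
    }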
    

You could run it like this:

    awk -f thatAwkScript zTempMonth >> zTempMonth.tmp

This appends to zTempMonth.tmp, just as your current script does.
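
Before running this against the full file, it may be worth checking the field selection on a tiny sample. The sketch below does that in a scratch directory with the same two rules written inline; the three sample lines and the workdir variable are assumptions for illustration, not real router output.

```shell
#!/usr/bin/env bash
# Made-up sample input; the field layouts are assumptions.
workdir=$(mktemp -d)
cat > "$workdir/zTempMonth" <<'EOF'
Internet 10.0.0.1 12 aabb.ccdd.eeff ARPA Vlan10
10.0.0.2 ether aa:bb:cc:dd:ee:ff eth0
this line has five words
EOF

# Same two rules as the awk script, written inline.
# The comma in print inserts OFS, which is a single space by default.
awk 'NF == 6 { print $2, $4, $6 }
     NF == 4 { print $1, $3, $4 }' "$workdir/zTempMonth" >> "$workdir/zTempMonth.tmp"

cat "$workdir/zTempMonth.tmp"
```

Lines that have neither four nor six words, such as the third sample line, are silently dropped, which matches the behavior of the original loop.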