How can I store the "find" command results as an array in Bash?


I am trying to save the results of find as an array. Here is my code:

#!/bin/bash

echo "input : "
read input

echo "searching file with this pattern '${input}' under present directory"
array=`find . -name ${input}`

len=${#array[*]}
echo "found : ${len}"

i=0

while [ $i -lt $len ]
do
echo ${array[$i]}
let i++
done

I have two .txt files in the current directory, so I expect ${len} to be 2. However, it prints 1, because the entire output of find is stored as a single element. How can I fix this?

P.S.
I found several solutions on Stack Overflow to similar problems. However, they are a little different, so I can't apply them to my case. I need to store the results in a variable before the loop. Thanks again.


Solution

  • Update 2020 for Linux Users:

    If you have an up-to-date version of bash (4.4-alpha or better), as you probably do if you are on Linux, then you should be using Benjamin W.'s answer.

    If you are on macOS, which (last I checked) still shipped bash 3.2, or are otherwise using an older bash, then continue on to the next section.

    Answer for bash 4.3 or earlier

    Here is one solution for getting the output of find into a bash array:

    array=()
    while IFS=  read -r -d $'\0'; do
        array+=("$REPLY")
    done < <(find . -name "${input}" -print0)
    

    This is tricky because, in general, file names can contain spaces, newlines, and other script-hostile characters. The reliable way to use find and keep the file names safely separated from each other is -print0, which prints each file name followed by a null character. This would not be much of an inconvenience if the readarray/mapfile builtin in bash 4.3 and earlier supported null-separated input, but it does not. Bash's read does, and that leads us to the loop above.

    [This answer was originally written in 2014. If you have a recent version of bash, please see the update below.]

    How it works

    1. The first line creates an empty array: array=()

    2. Every time that the read statement is executed, a null-separated file name is read from standard input. The -r option tells read to leave backslash characters alone. The -d $'\0' tells read that the input will be null-separated. Since we omit the name to read, the shell puts the input into the default name: REPLY.

    3. The array+=("$REPLY") statement appends the new file name to the array array.

    4. The final line combines redirection (<) and process substitution (<(...)) to provide the output of find to the standard input of the while loop.
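
    The steps above can be tried end to end with a small self-contained sketch. The scratch directory, the sample file names, and the hard-coded pattern in input are assumptions made for illustration; they stand in for the current directory and the pattern read from the user in the question.

```shell
#!/usr/bin/env bash
# Hypothetical setup: a scratch directory holding awkward file names.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
touch "$tmpdir/plain.txt" \
      "$tmpdir/with space.txt" \
      "$tmpdir/with"$'\n'"newline.txt"

input='*.txt'   # assumed pattern; the question reads it with `read input`

# Step 1: start with an empty array.
array=()
# Steps 2-4: read one null-terminated name at a time into REPLY and append it.
while IFS= read -r -d $'\0'; do
    array+=("$REPLY")
done < <(find "$tmpdir" -name "$input" -print0)

echo "found: ${#array[@]}"
# Print each element between angle brackets so embedded whitespace is visible.
for f in "${array[@]}"; do
    printf '<%s>\n' "$f"
done
```

    Each file ends up as exactly one array element, including the name that contains a newline.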

    Why use process substitution?

    If we didn't use process substitution, the loop could be written as:

    array=()
    find . -name "${input}" -print0 >tmpfile
    while IFS=  read -r -d $'\0'; do
        array+=("$REPLY")
    done <tmpfile
    rm -f tmpfile
    

    In the above, the output of find is stored in a temporary file, and that file is used as standard input to the while loop. The idea of process substitution is to make such temporary files unnecessary. So, instead of having the while loop get its stdin from tmpfile, we can have it get its stdin from <(find . -name "${input}" -print0).

    Process substitution is widely useful. In many places where a command wants to read from a file, you can specify process substitution, <(...), instead of a file name. There is an analogous form, >(...), that can be used in place of a file name where the command wants to write to the file.

    Like arrays, process substitution is a feature of bash and other advanced shells. It is not part of the POSIX standard.
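
    As a small illustration of the <(...) form outside of the find loop, here is a sketch that passes two command outputs to comm, a tool that normally expects two file names. The fruit lists are made-up sample data.

```shell
#!/usr/bin/env bash
# comm -12 prints only the lines common to both (sorted) inputs.
# Each <(...) is replaced by a file name that comm can open and read,
# so no temporary files are needed.
common=$(comm -12 \
    <(printf '%s\n' apple banana cherry) \
    <(printf '%s\n' banana cherry date))

printf '%s\n' "$common"
```

    Here the two lists share banana and cherry, so those are the two lines printed.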

    Alternative: lastpipe

    If desired, lastpipe can be used instead of process substitution (hat tip: Caesar):

    set +m
    shopt -s lastpipe
    array=()
    find . -name "${input}" -print0 | while IFS=  read -r -d $'\0'; do array+=("$REPLY"); done; declare -p array
    

    shopt -s lastpipe (available since bash 4.2) tells bash to run the last command in the pipeline in the current shell rather than in a subshell. This way, the array remains in existence after the pipeline completes. Because lastpipe only takes effect if job control is turned off, we run set +m. (In a script, as opposed to the command line, job control is off by default.)
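
    To see the difference lastpipe makes, the following sketch runs the same pipeline twice, once before and once after enabling the option. It assumes bash 4.2 or later; the scratch directory and the array names lost and kept are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical setup: a scratch directory with two matching files.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
touch "$tmpdir/a.txt" "$tmpdir/b c.txt"

set +m   # job control off (already the default in a script)

# Without lastpipe the while loop runs in a subshell,
# so the elements it appends vanish when the pipeline ends.
lost=()
find "$tmpdir" -name '*.txt' -print0 |
    while IFS= read -r -d $'\0'; do lost+=("$REPLY"); done
echo "without lastpipe: ${#lost[@]}"

# With lastpipe the loop runs in the current shell and the array survives.
shopt -s lastpipe
kept=()
find "$tmpdir" -name '*.txt' -print0 |
    while IFS= read -r -d $'\0'; do kept+=("$REPLY"); done
echo "with lastpipe: ${#kept[@]}"
```

    The first count is 0 and the second is 2, which is why the option matters when a pipeline is preferred over process substitution.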

    Additional notes

    The following command creates a shell variable, not a shell array:

    array=`find . -name "${input}"`
    

    If you wanted to create an array, you would need to put parens around the output of find. So, naively, one could:

    array=(`find . -name "${input}"`)  # don't do this
    

    The problem is that the shell performs word splitting (and pathname expansion) on the output of find, so the elements of the array are not guaranteed to be what you want. A file name containing a space, for example, is split into two elements.
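
    A minimal sketch of that failure mode, using one hypothetical file whose name contains a space and a scratch directory created just for the demonstration:

```shell
#!/usr/bin/env bash
# Hypothetical setup: one file with a space in its name.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
cd "$tmpdir" || exit 1
touch "my notes.txt"

# The naive form: the output of find is word-split on whitespace.
naive=($(find . -name '*.txt'))   # don't do this

echo "files on disk: 1, array elements: ${#naive[@]}"
printf '<%s>\n' "${naive[@]}"
```

    The single file shows up as two elements, ./my and notes.txt, neither of which is a real path.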

    Update 2019

    Starting with version 4.4-alpha, bash's readarray/mapfile supports a -d option for specifying the delimiter, so the above loop is no longer necessary. Instead, one can use:

    mapfile -d $'\0' array < <(find . -name "${input}" -print0)
    

    For more information on this, please see (and upvote) Benjamin W.'s answer.
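
    For scripts that may run under either an old or a new bash, one possible approach is to check BASH_VERSINFO and fall back to the read loop. This is only a sketch: the version test, the scratch directory, and the sample file names are assumptions added for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical setup: a scratch directory with two matching files.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT
touch "$tmpdir/one.txt" "$tmpdir/two words.txt"

array=()
if (( BASH_VERSINFO[0] > 4 || (BASH_VERSINFO[0] == 4 && BASH_VERSINFO[1] >= 4) )); then
    # bash 4.4 or later: mapfile can split on the null character directly.
    mapfile -d $'\0' array < <(find "$tmpdir" -name '*.txt' -print0)
else
    # Older bash: fall back to the read loop.
    while IFS= read -r -d $'\0'; do
        array+=("$REPLY")
    done < <(find "$tmpdir" -name '*.txt' -print0)
fi

echo "found: ${#array[@]}"
```

    Both branches produce the same array, so the rest of the script does not need to know which one ran.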