I need to do an array search so many time in a bash process. I need to know what is the most quick and efficient way to do it. I know how to do it. The point of the question os how to do it in the fastest way. Now, I'm doing in this way:
#!/bin/bash
array_test=("text1" "text2" "text3" "text4" "text5")
text_to_search="text4"
START=$(date +%s.%N)
for item in "${array_test[@]}"; do
if [ ${item} = "${text_to_search}" ]; then
echo "found!!"
break
fi
done
END=$(date +%s.%N)
DIFF=$(echo "$END - $START" | bc)
echo $DIFF
With this code we can measure the time.
Imagine we have 300 items or more in array. Is there a quicker way? I need to increase performance. Thank you.
EDIT I'm using bash 4.2 . The real array has line breaks:
array_test=(
"text1"
"text2"
"text3"
"text4"
"text5"
)
Use grep -qFx
in array[*]
.
-q
suppresses output and exists on the first match.
-F
searches fixed strings instead of regexes.
-x
searches whole lines instead of substrings. This ensures that the search string b
doesn't match the element abc
.
if ( IFS=$'\n'; echo "${array[*]}" ) | grep -qFx "$text_to_search"; then
echo "found!"
fi
Assumption: Neither the array nor the text to search contain linebreaks.
It's also possible to search texts with linebreaks, as long as there is a known unused character which can be stored inside a bash variable. I found \x1E
(the ASCII control character »Record Separator«) to work out quite well. The adapted version looks as follows:
export d=$'\x1E' # unused character, here "Record Seperator"
if ( IFS="$d"; echo "$d${array[*]}$d" ) | grep -qF "$d$text_to_search$d"; then
echo "found!"
fi
In theory you might be able to speed this up even further by using multiple parallel grep
on slices of the array. However, this is so fast (see results below) that you probably won't ever encounter a scenario where the parallel search pays off.
I used an array of size 1'000'000, which was generated as follows:
size=$((10 ** 6))
array_test=($(seq -f 'text%.0f' 1 "$size"))
(By the way: using {1..1000000}
was magnitudes slower than seq
)
The search pattern was the last entry of said array
text_to_search="text$size"
Three search methods were tested
for
loopprintf %s\\n "array[@]" | grep -qFx
(IFS=$'\n'; echo "array[*]";) | grep -qFx
With the following results: