The following are two arrays of strings:
arr1=("aa" "bb" "cc" "dd" "ee")
echo ${#arr1[@]} # output => 5
arr2=("cc" "dd" "ee" "ff")
echo ${#arr2[@]} # output => 4
The difference of the two arrays is arr_diff=("aa" "bb" "ff")
I can get the difference using the following, among other methods from Stack Overflow:
arr_diff=$(echo ${arr1[@]} ${arr2[@]} | tr ' ' '\n' | sort | uniq -u)
OR
arr_diff=$(echo ${arr1[@]} ${arr2[@]} | xargs -n1 | sort | uniq -u)
echo ${arr_diff[@]} # output => aa bb ff
The point is not to print the difference of the arrays, but to get the size of the difference array so that I can check whether the two arrays have the same elements. However, when I query the size of the difference array, I get the wrong answer.
echo ${#arr_diff[@]} # output => 1
I always get 1 as the output, irrespective of the size of the difference array (even when the size should be zero, i.e. when arr1 and arr2 have the same elements).
To get the elements that differ between the two arrays, you can use this awk command:
arr1=("aa" "bb" "cc" "dd" "ee")
arr2=("cc" "dd" "ee" "ff")
awk 'FNR == NR {
arr[$1]
next
}
{
if ($1 in arr)
delete arr[$1]
else
print $1
}
END {
for (i in arr)
print i
}' <(printf '%s\n' "${arr1[@]}") <(printf '%s\n' "${arr2[@]}")
ff
aa
bb
Now, to store the difference in an array, use:
read -ra diffarr < <(awk -v ORS=' ' 'FNR == NR {arr[$1]; next} {if ($1 in arr) delete arr[$1]; else print $1} END{for (i in arr) print i}' <(printf '%s\n' "${arr1[@]}") <(printf '%s\n' "${arr2[@]}"))
# check diffarr content
declare -p diffarr
declare -a diffarr=([0]="ff" [1]="aa" [2]="bb")
# print number of elements in diffarr
echo "${#diffarr[@]}"
3
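As for why the original attempt always reports 1: `$(...)` returns a single string, so the variable is a scalar, and bash counts a scalar as one element. Below is a minimal sketch of that behaviour and of one possible fix that keeps the `sort | uniq -u` pipeline from the question. It assumes bash 4 or later (for `mapfile`), elements that contain no newlines, and no duplicates within a single array. The name `str_diff` is only illustrative.

```shell
#!/usr/bin/env bash
arr1=("aa" "bb" "cc" "dd" "ee")
arr2=("cc" "dd" "ee" "ff")

# Command substitution yields one string, so str_diff is a scalar.
str_diff=$(printf '%s\n' "${arr1[@]}" "${arr2[@]}" | sort | uniq -u)
echo "${#str_diff[@]}"   # prints 1, even if the string is empty

# mapfile -t stores each output line as a separate array element.
mapfile -t arr_diff < <(printf '%s\n' "${arr1[@]}" "${arr2[@]}" | sort | uniq -u)
echo "${#arr_diff[@]}"   # prints 3

# Validate whether both arrays hold the same elements.
if (( ${#arr_diff[@]} == 0 )); then
    echo "arrays have the same elements"
else
    echo "arrays differ: ${arr_diff[*]}"
fi
```

Because `uniq -u` drops every value that appears more than once, a value duplicated inside one array would be hidden from the result. The awk approach above does not have that particular limitation for values repeated in the second array only if they are also absent from the first, so pick whichever matches your data.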