I would like to store the names of all my hbase tables in an array inside my bash script.
sed hotfixes are acceptable. Approaches other than readarray (for example, getting it from some ZooKeeper file I am not aware of) are acceptable too.
I have two hbase tables called MY_TABLE_NAME_1 and MY_TABLE_NAME_2, so what I want would be:
tables=(
MY_TABLE_NAME_1
MY_TABLE_NAME_2
)
What I tried:
Based on "HBase Shell in OS Scripts" by Cloudera:
echo "list" | /path/to/hbase/bin/hbase shell -n > /home/me/hbase-tables
readarray -t tables < /home/me/hbase-tables
but the contents of my /home/me/hbase-tables are:
MY_TABLE_NAME_1
MY_TABLE_NAME_2
2 row(s) in 0.3310 seconds
MY_TABLE_NAME_1
MY_TABLE_NAME_2
You could use readarray/mapfile just fine. But to remove duplicates, skip empty lines, and drop the unnecessary strings, you need a filter, for example with awk.
Also, you don't need to create a temporary file and then parse it. You can instead use a technique called process substitution, which makes the output of a command available as if it were in a temporary file:
mapfile -t output < <(echo "list" | /path/to/hbase/bin/hbase shell -n | awk '!unique[$0]++ && !/seconds/ && NF')
Now the array contains only the unique table names from the hbase output. That said, you should really look for a way to remove the noise as part of the query output rather than post-processing it this way.
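To try the filter without a running HBase, here is a minimal sketch that feeds the same kind of output through the awk expression. The function fake_hbase_list is a hypothetical stand-in for the real `echo "list" | /path/to/hbase/bin/hbase shell -n` pipeline; its lines are copied from the question, plus one assumed blank line to exercise the empty-line check. It needs bash 4 or later for mapfile.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for: echo "list" | /path/to/hbase/bin/hbase shell -n
# Lines are taken from the question; the blank line is an assumption.
fake_hbase_list() {
  printf '%s\n' \
    'MY_TABLE_NAME_1' \
    'MY_TABLE_NAME_2' \
    '2 row(s) in 0.3310 seconds' \
    '' \
    'MY_TABLE_NAME_1' \
    'MY_TABLE_NAME_2'
}

# !unique[$0]++  keeps only the first occurrence of each line
# !/seconds/     drops the "N row(s) in X seconds" summary line
# NF             drops empty lines (they have zero fields)
mapfile -t tables < <(fake_hbase_list | awk '!unique[$0]++ && !/seconds/ && NF')

printf 'found %d tables\n' "${#tables[@]}"
printf '%s\n' "${tables[@]}"
```

Note that the `!/seconds/` test would also discard any table whose name happens to contain the word "seconds", so a stricter pattern may be worth considering if that can occur in your setup.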