ISSUE/QUESTION:
How can we ensure that an EMR bootstrap action runs after HBase has been installed on the cluster?
CLUSTER INFO:
I am using EMR release emr-5.25.0, which ships HBase 1.4.9.
USE-CASE: I am installing Geomesa on EMR with a bootstrap action, following this document: https://www.geomesa.org/documentation/tutorials/geomesa-hbase-s3-on-aws.html
OBSERVATION:
I am using the code below as a bootstrap action, and I see that it starts before HBase is installed on the cluster. I want to use a bootstrap action so that Geomesa is installed on every master node in a multi-master setup.
#!/bin/bash
set -e -x
# Read the isMaster flag from the EMR instance metadata, if present.
IS_MASTER=false
if [ -f /mnt/var/lib/info/instance.json ]
then
  IS_MASTER=$(tr -d '\n ' < /mnt/var/lib/info/instance.json | sed -n 's|.*\"isMaster\":\([^,]*\).*|\1|p')
fi
if [[ $IS_MASTER == false* ]]
then
  echo "Not the master server."
  exit 0
else
  echo "Installing Geomesa on Master Server."
  GEOMESA_INSTALLATION_FILE_S3_LOCATION="$1"
  GEOMESA_FILE_VERSION="$2"
  # Initialize the Geomesa version.
  export GEOMESA_VERSION="$3"
  # Create the jars directory.
  mkdir -p /home/hadoop/jars
  # Copy the Geomesa 2.3.0 archive from S3 to the local jars directory.
  aws s3 cp "$GEOMESA_INSTALLATION_FILE_S3_LOCATION" /home/hadoop/jars
  # Change to /opt.
  cd /opt/
  # Extract the Geomesa archive into /opt.
  sudo tar zxvf "/home/hadoop/jars/geomesa-hbase-dist_${GEOMESA_FILE_VERSION}-bin.tar.gz"
  # Run bootstrap-geomesa-hbase-aws.sh to bootstrap Geomesa on EMR.
  sudo "/opt/geomesa-hbase_${GEOMESA_FILE_VERSION}/bin/bootstrap-geomesa-hbase-aws.sh"
  # Change to /etc/hadoop/conf.
  cd /etc/hadoop/conf
  # Copy hbase-site.xml into /etc/hadoop/conf.
  sudo cp /usr/lib/hbase/conf/hbase-site.xml /etc/hadoop/conf
  # Create a .zip file containing hbase-site.xml.
  sudo zip /home/hadoop/jars/hbase-site.zip hbase-site.xml
  # Point GEOMESA_EXTRA_CLASSPATHS at hbase-site.zip.
  export GEOMESA_EXTRA_CLASSPATHS=/home/hadoop/jars/hbase-site.zip
fi
Use Steps. Bootstrap actions always run after the instance is provisioned and before the applications are installed, so a bootstrap action cannot rely on HBase being present. Run your script as a step instead. First, add a Custom JAR step that uses the following JAR:
s3://<region prefix>.elasticmapreduce/libs/script-runner/script-runner.jar
The argument is
s3://<your bucket>/<path>/<script>.sh
and set the action on failure to Continue. Don't check the option "Auto-terminate cluster after the last step is completed".
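
If you prefer the AWS CLI to the console, the same step can probably be submitted with `aws emr add-steps`. The sketch below only builds and prints the `--steps` value so you can inspect it before submitting anything. The helper `build_step_spec`, the step name, the region, the bucket, the script path and the script arguments are all placeholders I made up for illustration; replace them with your own values.

```shell
#!/bin/bash
# Sketch: build the --steps value for a Custom JAR step that runs a script
# through script-runner.jar. Every concrete value here is a placeholder.

# build_step_spec REGION SCRIPT_S3_PATH [SCRIPT_ARG...]
# Prints a shorthand step definition with ActionOnFailure=CONTINUE.
build_step_spec() {
  local region="$1" script_s3="$2"
  shift 2
  local args="$script_s3" a
  # Append the script's own arguments after the script path.
  for a in "$@"; do
    args="$args,$a"
  done
  printf 'Type=CUSTOM_JAR,Name=InstallGeomesa,ActionOnFailure=CONTINUE,Jar=s3://%s.elasticmapreduce/libs/script-runner/script-runner.jar,Args=[%s]\n' \
    "$region" "$args"
}

# Hypothetical values: region, bucket, script location and the three
# arguments the script above reads as $1, $2 and $3.
STEP_SPEC="$(build_step_spec us-east-1 \
  s3://my-bucket/scripts/install-geomesa.sh \
  s3://my-bucket/geomesa/geomesa-hbase-dist.tar.gz 2.11-2.3.0 2.3.0)"

echo "$STEP_SPEC"

# To submit the step to a running cluster (cluster ID is a placeholder):
#   aws emr add-steps --cluster-id j-XXXXXXXXXXXXX --steps "$STEP_SPEC"
```

Printing the step definition first makes it easy to check the JAR path and argument order; uncomment the last command once the values look right.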