hbase amazon-emr geomesa

Bootstrap Action after HBase Installation


ISSUE/QUESTION:
How can we ensure that an EMR bootstrap action runs after the HBase application has been installed on EMR?

CLUSTER INFO:
I am using emr-5.25.0, which supports HBase 1.4.9.

USE-CASE: I am installing GeoMesa on EMR with a bootstrap action, following this document: https://www.geomesa.org/documentation/tutorials/geomesa-hbase-s3-on-aws.html

OBSERVATION:
I am using the script below as a bootstrap action. The bootstrap action starts before HBase is installed on the cluster. I want to use a bootstrap action so that GeoMesa is installed on every master node in a multi-master setup.

#!/bin/bash

set -e -x

IS_MASTER=false

if [ -f /mnt/var/lib/info/instance.json ]
then
  IS_MASTER=$(tr -d '\n ' < /mnt/var/lib/info/instance.json | sed -n 's|.*"isMaster":\([^,]*\).*|\1|p')
fi

if [[ $IS_MASTER == false* ]] 
then
  echo "Not the master server."
  exit 0
else   
  echo "Installing Geomesa on Master Server."  
  GEOMESA_INSTALLATION_FILE_S3_LOCATION="$1"
  GEOMESA_FILE_VERSION="$2"

  # initialize the Geomesa version.
  export GEOMESA_VERSION="$3"

  # Create the jars directory.
  mkdir -p /home/hadoop/jars

  # Copy the GeoMesa distribution archive from S3 to the local jars directory.
  aws s3 cp "$GEOMESA_INSTALLATION_FILE_S3_LOCATION" /home/hadoop/jars

  # Change to the /opt directory.
  cd /opt/

  # Extract the GeoMesa archive into /opt.
  sudo tar zxvf "/home/hadoop/jars/geomesa-hbase-dist_${GEOMESA_FILE_VERSION}-bin.tar.gz"

  # run bootstrap-geomesa-hbase-aws.sh file to bootstrap geomesa on EMR.
  sudo /opt/geomesa-hbase_${GEOMESA_FILE_VERSION}/bin/bootstrap-geomesa-hbase-aws.sh

  # Go to /etc/hadoop/conf
  cd /etc/hadoop/conf

  # Copy hbase-site.xml into /etc/hadoop/conf
  sudo cp /usr/lib/hbase/conf/hbase-site.xml /etc/hadoop/conf

  # Create .zip file for hbase-site.xml
  sudo zip /home/hadoop/jars/hbase-site.zip hbase-site.xml

  # initialize GEOMESA_EXTRA_CLASSPATHS to hbase-site.zip
  export GEOMESA_EXTRA_CLASSPATHS=/home/hadoop/jars/hbase-site.zip
fi

Solution

  • Use Steps. A bootstrap action always runs after the server is provisioned and before the applications are installed, so it cannot depend on HBase being present. Run your script as a step instead. First, add a Custom JAR step that uses the following JAR:

    s3://<region prefix>.elasticmapreduce/libs/script-runner/script-runner.jar
    

    The argument is the S3 location of your script:

    s3://<your bucket>/<path>/<script>.sh
    

    Set the action on failure to Continue, and do not check the option

    Auto-terminate cluster after the last step is completed
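
The same step can probably also be submitted from the AWS CLI rather than the console. The following is a minimal sketch, not a tested deployment: the cluster ID, region, bucket, script name, step name and archive name are all hypothetical placeholders, and the three trailing arguments simply mirror the $1, $2 and $3 that the question's script expects. It only prints the command unless RUN=1 is set.

```shell
#!/bin/bash
# Sketch: run the install script as an EMR step through script-runner.jar.
# Every concrete value used here is a placeholder (assumption), not from the post.

# Build the shorthand value for the --steps option of `aws emr add-steps`.
# $1 = region prefix of the script-runner bucket, $2 = S3 path of the script,
# remaining arguments = arguments passed through to the script.
build_step_spec() {
  local region="$1"
  local script="$2"
  shift 2
  local args="$script"
  local a
  for a in "$@"; do
    args="$args,$a"
  done
  printf 'Type=CUSTOM_JAR,Name=InstallGeomesa,ActionOnFailure=CONTINUE,Jar=s3://%s.elasticmapreduce/libs/script-runner/script-runner.jar,Args=[%s]\n' "$region" "$args"
}

# Dry run by default: print the command. Set RUN=1 to actually submit the step.
submit_step() {
  local cluster_id="$1"
  shift
  local spec
  spec="$(build_step_spec "$@")"
  if [ "${RUN:-0}" = "1" ]; then
    aws emr add-steps --cluster-id "$cluster_id" --steps "$spec"
  else
    echo "aws emr add-steps --cluster-id $cluster_id --steps $spec"
  fi
}

# Hypothetical values; replace with your own cluster ID, region and S3 paths.
submit_step "j-XXXXXXXXXXXXX" "us-east-1" \
  "s3://my-bucket/scripts/install-geomesa.sh" \
  "s3://my-bucket/geomesa/geomesa-hbase-dist_2.11-2.3.0-bin.tar.gz" \
  "2.11-2.3.0" "2.3.0"
```

ActionOnFailure=CONTINUE corresponds to the "Continue" setting described above. Arguments that themselves contain commas would need the JSON form of --steps instead of this shorthand.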