I am using emr-6.12.0 and trying to set environment variables, which are stored in AWS Secrets Manager, from my bootstrap.sh file.
SECRET_NAME="/myapp/dev/secrets"
SECRETS_JSON=$(aws secretsmanager get-secret-value --secret-id "$SECRET_NAME" --query SecretString --output text)
# Parse the secrets and set them as environment variables
for key in $(echo "$SECRETS_JSON" | jq -r "keys[]"); do
    value=$(echo "$SECRETS_JSON" | jq -r ".$key // empty" | sed 's/"/\\"/g')
    echo "$value"
    if [ -n "$value" ]; then
        export "$key"="$value"
    fi
done
I can see these values in the bootstrap log, but when I try to access the variables from my PySpark script, they are not set:
os.environ.get("POSTGRES_URL")  # returns None
for key, value in os.environ.items():
    self.logger.info(f"{key}: {value}")  # my env variables are not listed
I am new to EMR and Spark. How can I set environment variables on EMR from values stored in Secrets Manager?
To retrieve secrets from AWS Secrets Manager in your Python application, first install the caching client:
pip install aws-secretsmanager-caching
After that, in your app.py, you'll have something like this:
import botocore
import botocore.session
from aws_secretsmanager_caching import SecretCache, SecretCacheConfig
client = botocore.session.get_session().create_client('secretsmanager')
cache_config = SecretCacheConfig()
cache = SecretCache(config=cache_config, client=client)
secret = cache.get_secret_string('mysecret')
NB: You must have the following:
Required permission:
Official doc
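To connect this back to the question: variables exported in bootstrap.sh live only in that script's shell, so the PySpark process never sees them. Below is a minimal sketch of one way to get the same effect inside the script itself. It assumes the secret is stored as a flat JSON object, as the `jq` loop in the question implies. The helper name `load_secret_into_env` is hypothetical and not part of any AWS library.

```python
import json
import os


def load_secret_into_env(secret_string, environ=os.environ):
    """Parse a JSON secret string and copy its non-empty values into environ.

    Hypothetical helper for illustration. Returns the list of keys that were set.
    """
    secrets = json.loads(secret_string)
    applied = []
    for key, value in secrets.items():
        # Mirror the bootstrap script: skip null or empty values.
        if value is None or value == "":
            continue
        # Environment variable values must be strings.
        environ[key] = str(value)
        applied.append(key)
    return applied


# Assumed usage inside the PySpark script, reusing the `cache` object built
# above and the secret name from the question:
#   load_secret_into_env(cache.get_secret_string("/myapp/dev/secrets"))
#   os.environ.get("POSTGRES_URL")
```

Note that this only changes the environment of the process that calls it, typically the Spark driver. Code running on executors would need to read the values another way, for example by passing them explicitly into the functions that use them.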