In my PySpark project I'm using a Python package that uses Dynaconf, so I need to set the following environment variable: ENV_FOR_DYNACONF = platform.
The problem is that I don't understand how I can pass this environment variable to the EMR Serverless job run.
I've tried this -
os.environ['ENV_FOR_DYNACONF'] = platform
at the beginning of the code, but it didn't work. In any case, I want to understand the right way to pass environment variables to an EMR Serverless job.
Can anyone help?
To pass environment variables in EMR Serverless, you can use the following Spark job properties in sparkSubmitParameters:
spark.executorEnv.[KEY] to pass env variables to the executors.
spark.emr-serverless.driverEnv.[KEY] to pass env variables to the driver.
Example:
"sparkSubmitParameters": "--conf spark.emr-serverless.driverEnv.ENV_FOR_DYNACONF=platform"
For more information, refer to https://docs.aws.amazon.com/emr/latest/EMR-Serverless-UserGuide/jobs-spark.html
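If you submit the job from Python, something like the sketch below could build that sparkSubmitParameters string for both the driver and the executors. The helper name build_env_conf and the S3 entry point are made up for illustration, and it assumes the keys and values contain no spaces or quotes.

```python
def build_env_conf(driver_env, executor_env):
    """Turn env var dicts into --conf flags for sparkSubmitParameters.

    Hypothetical helper; assumes keys and values contain no spaces or quotes.
    """
    parts = []
    for key, value in driver_env.items():
        # Property from the answer above: sets the variable on the driver.
        parts.append(f"--conf spark.emr-serverless.driverEnv.{key}={value}")
    for key, value in executor_env.items():
        # Property from the answer above: sets the variable on the executors.
        parts.append(f"--conf spark.executorEnv.{key}={value}")
    return " ".join(parts)


env = {"ENV_FOR_DYNACONF": "platform"}

job_driver = {
    "sparkSubmit": {
        # Placeholder path, replace with your own script location.
        "entryPoint": "s3://my-bucket/scripts/main.py",
        "sparkSubmitParameters": build_env_conf(env, env),
    }
}

print(job_driver["sparkSubmit"]["sparkSubmitParameters"])
# --conf spark.emr-serverless.driverEnv.ENV_FOR_DYNACONF=platform --conf spark.executorEnv.ENV_FOR_DYNACONF=platform

# To actually submit (needs boto3, AWS credentials, and your own IDs):
# import boto3
# client = boto3.client("emr-serverless")
# client.start_job_run(
#     applicationId="<your-application-id>",
#     executionRoleArn="<your-execution-role-arn>",
#     jobDriver=job_driver,
# )
```

Setting the variable on both sides is the safer default if the package reading Dynaconf settings might be imported inside code that runs on the executors (UDFs, for example), not only on the driver.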