I am trying to run an AWS Glue job locally from a Docker container and I am getting the following error:
File "/glue/script.py", line 19, in <module>
job.init(args['JOB_NAME'], args)
File "/glue/aws-glue-libs/PyGlue.zip/awsglue/job.py", line 38, in init
File "/glue/spark-2.4.3-bin-spark-2.4.3-bin-hadoop2.8/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
File "/glue/spark-2.4.3-bin-spark-2.4.3-bin-hadoop2.8/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
File "/glue/spark-2.4.3-bin-spark-2.4.3-bin-hadoop2.8/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:com.amazonaws.services.glue.util.Job.init.
: com.amazonaws.SdkClientException: Unable to load region information from any provider in the chain
It appears that it is unable to find the region, but I have stored my config and credentials files in the usual path within the container, so it should be able to find them from there. Or should I be declaring the region from within the script file?
Here are the first few lines of the job. It currently fails on the last line:
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.sql.functions import *
from awsglue.dynamicframe import DynamicFrame  ## @type: DataSource
import datetime
import boto3
## @params: [JOB_NAME]
args = getResolvedOptions(sys.argv, ['JOB_NAME'])
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)
I have run Glue jobs locally from a Docker container, and it worked well for me.
I have written a blog post about it, and the Docker image is also available on Docker Hub. I am not sure what causes this error, but in case you want to try the image, here are the links:
Article: https://towardsdatascience.com/develop-glue-jobs-locally-using-docker-containers-bffc9d95bd1
GitHub: https://github.com/jnshubham/aws-glue-local-etl-docker
I don't run into the region issue with this setup, so check whether it helps you.
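If the error persists, it may help to check from inside the container whether a region is visible at all. Below is a minimal sketch, not part of boto3 or awsglue: find_region is a hypothetical helper that only mimics the two most common lookups (environment variables, then the shared config file). It assumes the config file uses the standard [default] / [profile name] layout. As far as I know the Java SDK that raises this error reads AWS_REGION, while boto3 and the AWS CLI use AWS_DEFAULT_REGION, so the sketch checks both.

```python
import configparser
import os
from pathlib import Path


def find_region(env=None, config_path=None):
    """Return (region, source), or (None, None) if no region is configured.

    Hypothetical diagnostic helper; it does not reproduce the full
    provider chain of any AWS SDK.
    """
    env = os.environ if env is None else env

    # 1. Environment variables (assumption: AWS_REGION for the Java SDK,
    #    AWS_DEFAULT_REGION for boto3 and the AWS CLI).
    for name in ("AWS_REGION", "AWS_DEFAULT_REGION"):
        if env.get(name):
            return env[name], name

    # 2. Shared config file: explicit path, AWS_CONFIG_FILE, or ~/.aws/config.
    path = Path(
        config_path
        or env.get("AWS_CONFIG_FILE")
        or Path.home() / ".aws" / "config"
    )
    if not path.is_file():
        return None, None

    parser = configparser.ConfigParser()
    parser.read(path)

    # Non-default profiles are stored as "[profile name]" in the config file.
    profile = env.get("AWS_PROFILE", "default")
    section = "default" if profile == "default" else f"profile {profile}"
    if parser.has_option(section, "region"):
        return parser.get(section, "region"), str(path)
    return None, None


if __name__ == "__main__":
    region, source = find_region()
    print(f"region={region!r} (from {source})")
```

If this prints region=None, the files are probably not where the process looks for them, for example because the job runs as a different user with a different home directory. In that case, passing the region as an environment variable when starting the container (docker run -e AWS_REGION=...) is usually more reliable than setting it inside the script, because the JVM may already be running by the time the script's first line executes.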