scala · amazon-web-services · apache-spark · hadoop · amazon-s3

Can't connect from Spark to S3 - AmazonS3Exception Status Code: 400


I am trying to connect from Spark (running on my PC) to my S3 bucket:

import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder
  .appName("S3Client")
  .config("spark.master", "local")
  .getOrCreate()

val sc = spark.sparkContext
sc.hadoopConfiguration.set("fs.s3a.access.key", ACCESS_KEY)
sc.hadoopConfiguration.set("fs.s3a.secret.key", SECRET_KEY)

val txtFile = sc.textFile("s3a://bucket-name/folder/file.txt")
val contents = txtFile.collect()

But I get the following exception:

Exception in thread "main" com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 400, AWS Service: Amazon S3, AWS Request ID: 07A7BDC9135BCC84, AWS Error Code: null, AWS Error Message: Bad Request, S3 Extended Request ID: 6ly2vhZ2mAJdQl5UZ/QUdilFFN1hKhRzirw6h441oosGz+PLIvLW2fXsZ9xmd8cuBrNHCdh8UPE=

I have seen this question, but it didn't help me.

Edit:

As Zack suggested, I added:

sc.hadoopConfiguration.set("fs.s3a.endpoint", "s3.eu-central-1.amazonaws.com")

But I still get the same exception.


Solution

  • I've solved the problem.

    I was targeting a region (Frankfurt, eu-central-1) that requires Signature Version 4 for authentication.

    I changed the region of the S3 bucket to Ireland (eu-west-1), and now it works.
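    For anyone who needs to keep the bucket in a V4-only region such as Frankfurt, a sketch of the configuration that is commonly reported to work is below. It enables V4 signing in the AWS SDK via the com.amazonaws.services.s3.enableV4 system property and points s3a at the region-specific endpoint; the property names assume Hadoop 2.7.x with the s3a connector and the AWS SDK for Java v1.

import org.apache.spark.sql.SparkSession

// Enable Signature Version 4 in the AWS SDK before any S3 call is made.
System.setProperty("com.amazonaws.services.s3.enableV4", "true")

val spark = SparkSession
  .builder
  .appName("S3Client")
  .config("spark.master", "local")
  .getOrCreate()

val sc = spark.sparkContext
sc.hadoopConfiguration.set("fs.s3a.access.key", ACCESS_KEY)
sc.hadoopConfiguration.set("fs.s3a.secret.key", SECRET_KEY)
// A V4-only region must be addressed through its region-specific endpoint.
sc.hadoopConfiguration.set("fs.s3a.endpoint", "s3.eu-central-1.amazonaws.com")

val txtFile = sc.textFile("s3a://bucket-name/folder/file.txt")
val contents = txtFile.collect()

    Note that this works in local mode because the system property is set in the same JVM that runs the S3 client; on a real cluster, the property would have to be passed to the executors as well (e.g. via spark.executor.extraJavaOptions=-Dcom.amazonaws.services.s3.enableV4=true).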