I'm just getting started with EMR, Hadoop, Spark, etc. I am trying to use spark-shell to run Scala code that uploads a file to an EMRFS S3 location, but I am receiving the error below.
Without any import, if I run:
val bucketName = "bucket"
val outputPath = "test.txt"
scala> val putRequest = PutObjectRequest.builder.bucket(bucketName).key(outputPath).build()
<console>:27: error: not found: value PutObjectRequest
val putRequest = PutObjectRequest.builder.bucket(bucketName).key(outputPath).build()
^
Once I add the import for PutObjectRequest, I get a different error:
scala> import com.amazonaws.services.s3.model.PutObjectRequest
import com.amazonaws.services.s3.model.PutObjectRequest
scala> val putRequest = PutObjectRequest.builder.bucket(bucketName).key(outputPath).build()
<console>:28: error: value builder is not a member of object com.amazonaws.services.s3.model.PutObjectRequest
val putRequest = PutObjectRequest.builder.bucket(bucketName).key(outputPath).build()
^
I'm not sure what I am missing. Any help would be appreciated!
Note: Spark version is 2.4.5
The builder syntax you are using (PutObjectRequest.builder...) belongs to AWS SDK for Java v2 (software.amazon.awssdk), while the class you imported, com.amazonaws.services.s3.model.PutObjectRequest, is from SDK v1 and has no builder method. With the v1 SDK that is already on your classpath, create the PutObjectRequest via a suitable constructor instead, and create the connection to S3 using AmazonS3ClientBuilder.
import com.amazonaws.regions.Regions
import com.amazonaws.services.s3.AmazonS3ClientBuilder
import com.amazonaws.services.s3.model.ObjectMetadata
import com.amazonaws.services.s3.model.PutObjectRequest
import java.io.File
val clientRegion = Regions.DEFAULT_REGION
val bucketName = "*** Bucket name ***"
val fileObjKeyName = "*** File object key name ***"
val fileName = "*** Path to file to upload ***"
// Build the S3 client (AWS SDK v1) for the chosen region.
val s3Client = AmazonS3ClientBuilder.standard.withRegion(clientRegion).build
// Upload a file as a new object with ContentType and title specified.
val request = new PutObjectRequest(bucketName, fileObjKeyName, new File(fileName))
// Attach optional metadata (content type and a user-defined title) before uploading.
val metadata = new ObjectMetadata()
metadata.setContentType("text/plain")
metadata.addUserMetadata("title", "someTitle")
request.setMetadata(metadata)
s3Client.putObject(request)
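If you want to keep the builder-style syntax from your original snippet, note that it belongs to AWS SDK for Java v2, not v1. A minimal sketch, assuming the v2 jars (e.g. software.amazon.awssdk:s3) have been added to the spark-shell classpath, and using your bucketName/outputPath vals plus a placeholder region and local file path:

import software.amazon.awssdk.regions.Region
import software.amazon.awssdk.services.s3.S3Client
import software.amazon.awssdk.services.s3.model.PutObjectRequest
import software.amazon.awssdk.core.sync.RequestBody
import java.nio.file.Paths

// Region and local file path below are placeholders -- substitute your own values.
val s3 = S3Client.builder.region(Region.US_EAST_1).build
val putRequest = PutObjectRequest.builder.bucket(bucketName).key(outputPath).build
s3.putObject(putRequest, RequestBody.fromFile(Paths.get("/path/to/test.txt")))

Either approach works; the key point is not to mix v1 imports (com.amazonaws.*) with v2 builder calls (software.amazon.awssdk.*).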