To copy a file from HDFS to an S3 bucket I used the command
hadoop distcp -Dfs.s3a.access.key=ACCESS_KEY_HERE \
    -Dfs.s3a.secret.key=SECRET_KEY_HERE /path/in/hdfs s3a://BUCKET_NAME
But the access key and secret key are visible there, which is not secure. Is there a way to provide the credentials from a file? I don't want to edit the config file, which is one of the methods I came across.
Recent (2.8+) versions let you hide your credentials in a JCEKS file; there's documentation on this on the Hadoop S3A page. That way there's no need to put any secrets on the command line at all; you store them once, share the file across the cluster, and then, in the distcp command, set hadoop.security.credential.provider.path
to its path, like jceks://hdfs@nn1.example.com:9001/user/backup/s3.jceks
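As a sketch (reusing the example credential-store path above and the ACCESS_KEY_HERE / SECRET_KEY_HERE / BUCKET_NAME placeholders from the question), you create the store with the hadoop credential command and then only reference the provider path in distcp:

# store the access key and secret key in a JCEKS credential store on HDFS
hadoop credential create fs.s3a.access.key -value ACCESS_KEY_HERE \
    -provider jceks://hdfs@nn1.example.com:9001/user/backup/s3.jceks
hadoop credential create fs.s3a.secret.key -value SECRET_KEY_HERE \
    -provider jceks://hdfs@nn1.example.com:9001/user/backup/s3.jceks

# run distcp pointing at the credential store; no keys on the command line
hadoop distcp \
    -Dhadoop.security.credential.provider.path=jceks://hdfs@nn1.example.com:9001/user/backup/s3.jceks \
    /path/in/hdfs s3a://BUCKET_NAME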
Fan: if you are running in EC2, the IAM role credentials should be picked up automatically from the default chain of credential providers: after looking for the config options and environment variables, it tries a GET of the EC2 instance-metadata HTTP endpoint, which serves up the session credentials. If that's not happening, make sure that com.amazonaws.auth.InstanceProfileCredentialsProvider is on the list of credential providers. It's a bit slower than the others (and can get throttled), so it's best placed near the end of the list.
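A minimal sketch of setting that provider explicitly for a single distcp run (assuming the same HDFS path and bucket placeholders as in the question); fs.s3a.aws.credentials.provider takes a comma-separated list, so this provider can also be appended after any others you already use:

# rely on the EC2 instance profile (IAM role) instead of passing keys
hadoop distcp \
    -Dfs.s3a.aws.credentials.provider=com.amazonaws.auth.InstanceProfileCredentialsProvider \
    /path/in/hdfs s3a://BUCKET_NAME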