When I run a simple select * query on AWS Athena I get an access denied error.
The query is:
select * from sensor.sensordata
The Schema is:
CREATE EXTERNAL TABLE sensor.sensordata (
sig string,
`data` struct<`iat`:timestamp,
`sub`:string,
tMax: float,
tMin: float,
`tAvg`: float,
`hAvg`: float,
hMin: float,
hMax: float
>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://mybucket/data/';
The error I get (IDs shortened) says that a file cannot be read:
com.amazonaws.services.s3.model.AmazonS3Exception:
Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied;
Request ID: B0048904...; S3 Extended Request ID: CKchfW8...), S3 Extended Request ID:
CKchfW8... (Path: s3://mybucket/data/sensor=01235EFD886C7DF1EE/t=1561513414.json)
However, I even made the bucket policy public for everyone:
{
"Version": "2008-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": "*",
"Action": [
"s3:PutObject",
"s3:GetObject"
],
"Resource": [
"arn:aws:s3:::mybucket/*",
"arn:aws:s3:::mybucket"
]
}
]
}
Besides the bucket policy, the bucket ACL grants the standard full access to the bucket owner, which is the same account I run my Athena query from. I run the query in the AWS Management Console.
Not sure if this is related: the AWS Glue crawler cannot read the files either. It can list them, but I get an error for every file.
What can I do to make the query work?
The problem is not AWS Athena itself, but the way I upload the files to S3.
I upload the data from an IoT device using an anonymous PUT request. This might not be very secure, but for my use case it's OK. But as John Rotenstein wrote in a comment on the question, if you do not set the bucket-owner-full-access ACL on the upload, Athena will not be able to access the files.
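For illustration, here is a minimal sketch of what such an upload could look like in Python, using only the standard library. It is not the code running on my device; the function names and the virtual-hosted-style URL are assumptions. The relevant part is the x-amz-acl request header, which is how the canned ACL is passed on a PUT.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def build_upload_request(bucket: str, key: str, body: bytes) -> Request:
    """Build an unsigned (anonymous) PUT request for one S3 object.

    Hypothetical helper: the URL layout assumes the bucket is reachable
    through the virtual-hosted-style global endpoint.
    """
    path = quote(key, safe="/=")
    url = "https://" + bucket + ".s3.amazonaws.com/" + path
    headers = {
        "Content-Type": "application/json",
        # Canned ACL that gives the bucket owner full control of the new object.
        "x-amz-acl": "bucket-owner-full-access",
    }
    return Request(url, data=body, headers=headers, method="PUT")


def upload(bucket: str, key: str, body: bytes) -> int:
    """Send the request and return the HTTP status code."""
    with urlopen(build_upload_request(bucket, key, body)) as response:
        return response.status
```

Calling upload("mybucket", "data/sensor=.../t=....json", payload) would then send the object with the ACL attached. Objects that were already uploaded without the header are not changed by this.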
Unfortunately, as far as I know, you can only fix this on the client side. On the AWS side you can force the client to set the ACL (see also John's answer at https://stackoverflow.com/a/50402903/55070), but you cannot set it on the AWS side yourself.
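As a rough sketch of the enforcement idea, the following Python snippet builds a bucket policy statement that denies any PutObject request that does not carry the bucket-owner-full-access ACL. The helper name and the Sid are made up for this example, and the statement would have to be merged with whatever Allow statements the bucket already has; I have not tested it against my own bucket.

```python
import json


def require_owner_acl_policy(bucket: str) -> dict:
    """Return a policy that rejects uploads lacking the canned ACL.

    Hypothetical helper; only the Deny statement is shown here.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "RequireBucketOwnerFullAccess",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": "arn:aws:s3:::" + bucket + "/*",
                # Deny when the x-amz-acl header is missing or has another value.
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-acl": "bucket-owner-full-access"
                    }
                },
            }
        ],
    }


# Print the JSON so it can be pasted into the bucket policy editor.
print(json.dumps(require_owner_acl_policy("mybucket"), indent=2))
```

With a statement like this in place, a client that forgets the header gets a 403 at upload time instead of producing files that Athena cannot read later.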
So in short: if you do an anonymous upload to S3 over HTTP, you have to set bucket-owner-full-access on the upload; otherwise AWS Athena cannot access the data, no matter what ACL settings the bucket has.