I am delivering CloudFront standard logs into an S3 bucket.
The bucket is created via Terragrunt/Terraform:
terraform {
source = "github.com/terraform-aws-modules/terraform-aws-s3-bucket?ref=v3.15.0"
}
include "root" {
path = find_in_parent_folders()
expose = true
}
locals {
vars = include.root.locals.vars
app_name = basename(dirname(abspath(get_terragrunt_dir())))
bucket = "${include.root.locals.tags.Environment}-${local.app_name}-logs-${get_aws_account_id()}"
}
inputs = {
tags = include.root.locals.tags
bucket = local.bucket
acl = "log-delivery-write"
block_public_policy = true
control_object_ownership = true
object_ownership = "BucketOwnerPreferred"
attach_public_policy = false
attach_policy = true
policy = {
Version = "2012-10-17"
Statement = [
{
Sid = "ListObjectsInBucket"
Effect = "Allow"
Principal = {
Service = "athena.amazonaws.com"
}
Action = "s3:ListBucket"
Resource = "arn:aws:s3:::${local.bucket}"
},
{
Sid = "AllObjectsActions"
Effect = "Allow"
Principal = {
Service = "athena.amazonaws.com"
}
Action = "s3:*"
Resource = "arn:aws:s3:::${local.bucket}/*"
}
]
}
lifecycle_rule = [
{
id = "cloudfront-logs-expiration"
enabled = true
prefix = "" # Apply to all objects
tags = {} # Apply regardless of tags
expiration = {
days = 30
}
}
]
force_destroy = local.vars.env != "prod"
}
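For context, the distribution's logging points at this bucket roughly like this (a minimal sketch, not my actual code; the distribution lives in another module and the names here are illustrative):

resource "aws_cloudfront_distribution" "this" {
  # ... origins, default_cache_behavior, viewer_certificate, etc. ...

  logging_config {
    bucket          = "my-logs-bucket.s3.amazonaws.com" # the bucket created above
    include_cookies = false
    prefix          = ""
  }
}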
The logs are delivered perfectly fine from CloudFront -> S3 above.
I create the table in AWS Athena with this query (using my own bucket in LOCATION, of course):
CREATE EXTERNAL TABLE IF NOT EXISTS cloudfront_standard_logs (
`date` DATE,
time STRING,
x_edge_location STRING,
sc_bytes BIGINT,
c_ip STRING,
cs_method STRING,
cs_host STRING,
cs_uri_stem STRING,
sc_status INT,
cs_referrer STRING,
cs_user_agent STRING,
cs_uri_query STRING,
cs_cookie STRING,
x_edge_result_type STRING,
x_edge_request_id STRING,
x_host_header STRING,
cs_protocol STRING,
cs_bytes BIGINT,
time_taken FLOAT,
x_forwarded_for STRING,
ssl_protocol STRING,
ssl_cipher STRING,
x_edge_response_result_type STRING,
cs_protocol_version STRING,
fle_status STRING,
fle_encrypted_fields INT,
c_port INT,
time_to_first_byte FLOAT,
x_edge_detailed_result_type STRING,
sc_content_type STRING,
sc_content_len BIGINT,
sc_range_start BIGINT,
sc_range_end BIGINT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION 's3://DOC-EXAMPLE-BUCKET/'
TBLPROPERTIES ( 'skip.header.line.count'='2' )
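For reference, each CloudFront standard log file is tab-separated and begins with two comment lines, which is why skip.header.line.count is set to 2:

#Version: 1.0
#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem sc-status ...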
That CREATE TABLE statement is documented at https://docs.aws.amazon.com/athena/latest/ug/cloudfront-logs.html and at https://repost.aws/knowledge-center/analyze-logs-athena.
So far, so good! The table is created.
The problem happens when I try to query the data:
SELECT
sc_status,
COUNT(*) AS request_count
FROM
cloudfront_standard_logs
GROUP BY
sc_status
How can I allow AWS Athena to access this bucket and fix the error? What changes should I make to the policy in my Terragrunt code? I don't understand why the permissions I gave are not enough.
The Terragrunt policy above is converted and attached to the bucket as this JSON (bucket name redacted):
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "ListObjectsInBucket",
"Effect": "Allow",
"Principal": {
"Service": "athena.amazonaws.com"
},
"Action": "s3:ListBucket",
"Resource": "arn:aws:s3:::secret"
},
{
"Sid": "AllObjectsActions",
"Effect": "Allow",
"Principal": {
"Service": "athena.amazonaws.com"
},
"Action": "s3:*",
"Resource": "arn:aws:s3:::secret*"
}
]
}
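From what I've read, Athena reads S3 with the IAM credentials of whoever runs the query, not as an athena.amazonaws.com service principal, so maybe the statement needs to target that caller instead. A sketch of what I mean, in the same style as my Terragrunt policy above (the role ARN is hypothetical):

{
  Sid    = "AthenaCallerAccess"
  Effect = "Allow"
  Principal = {
    AWS = "arn:aws:iam::111122223333:role/my-athena-query-role" # hypothetical caller
  }
  Action   = ["s3:GetObject", "s3:ListBucket"]
  Resource = [
    "arn:aws:s3:::${local.bucket}",
    "arn:aws:s3:::${local.bucket}/*"
  ]
}

Is that the right direction, or is something else going on?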
I fixed the issue by setting the ACL below, but I don't fully understand HOW and WHY:
acl = "log-delivery-write"
Magic: now Athena can access the files in my new bucket. Can anyone explain what this ACL actually changes?