I am new to AWS. I have a user account and two roles, one for prod one for test.
Usually I log into my account and switch to prod role to run some simple select queries.
Now I want to use Athena locally in Python with PyAthena. I have tried the following resources from PyAthena documentation:
from pyathena import connect
import pandas as pd
conn = connect(aws_access_key_id='YOUR_ACCESS_KEY_ID',
aws_secret_access_key='YOUR_SECRET_ACCESS_KEY',
s3_staging_dir='s3://YOUR_S3_BUCKET/path/to/',
region_name='us-west-2')
df = pd.read_sql("SELECT * FROM many_rows", conn)
print(df.head())
But always having the error
An error occurred (AccessDeniedException) when calling the StartQueryExecution operation: User: arn:aws:iam::xxxxxx:user/xxxx@xxxxx is not authorized to perform: athena:StartQueryExecution on resource: arn:aws:athena:ap-southeast-2:xxxxx:workgroup/primary
This is the exact error I would get if I run the same query using my user account without switching the role.
I have also tried to add a profile name parameter in connect
but still not working even though the env is correctly recognised.
Could someone help me how to do the 'switch' role step in local python code?
It seems like the issue is due to a missing role and not the profile_name
parameter. If you look at the Connection
class in pyathena, there's a role_arn
variable that you can specify while initializing the connection. Here's the line I'm talking about.
You might want to try it this way -
conn = connect(aws_access_key_id='YOUR_ACCESS_KEY_ID',
aws_secret_access_key='YOUR_SECRET_ACCESS_KEY',
s3_staging_dir='s3://YOUR_S3_BUCKET/path/to/',
region_name='us-west-2',
role_arn='<your arn here'>)
I haven't tested it myself though since I do not have an Athena setup.