I'm trying to query a parquet file uploaded to Amazon S3 from SQL Server 2022 using OPENROWSET
.
I can confirm that I can access the S3 bucket with those access keys because I ran a python script to generate and upload the parquet file to S3 successfully.
I can verify the file is present in S3 by logging into AWS:
When I try to query the file from SQL, I get this error:
File 's3://sqlexternaltabletest/parquet_folder/1936a792b4f94a9283e8f22354171fd7-0.parquet' cannot be opened because it does not exist or it is used by another process
S3 bucket name = "sqlexternaltabletest"
Here's my SQL code:
CREATE DATABASE SCOPED CREDENTIAL s3_dc
WITH IDENTITY = 'S3 Access Key',
SECRET = 'AccessKey:SecretKey';
CREATE EXTERNAL DATA SOURCE s3_ds
WITH
(
LOCATION = 's3://s3.amazonaws.com/sqlexternaltabletest/'
, CREDENTIAL = s3_dc
);
GO
SELECT *
FROM OPENROWSET
(
BULK '/parquet_folder/1936a792b4f94a9283e8f22354171fd7-0.parquet'
, FORMAT = 'PARQUET'
, DATA_SOURCE = 's3_ds'
) AS [cc];
Any idea what's wrong?
I was having the same issue, but did not have the region in the URL. I notice your URL has the same issue.
Make sure you have the region in your s3 location like this:
CREATE EXTERNAL DATA SOURCE s3_ds
WITH
( LOCATION = 's3://some-bucket.s3.us-west-2.amazonaws.com/prefix',
, CREDENTIAL = s3_dc
)