I'm trying to run a POC to get Trino querying data stored in Backblaze B2.
Following this example: https://github.com/bitsondatadev/trino-getting-started/blob/main/hive/trino-b2/README.md, I ran into the following issue:
SQL Error [16777216]: Query failed (#20240531_002231_00032_e9wd4): Got exception: org.apache.hadoop.fs.s3a.AWSBadRequestException getFileStatus on s3://temporalTests/data/data.csv: com.amazonaws.services.s3.model.AmazonS3Exception: (Service: Amazon S3; Status Code: 400; Error Code: 400 ; Request ID: f1a8d38c4e5afd88; S3 Extended Request ID: adUFuAGsobtRvT3evbss=; Proxy: null), S3 Extended Request ID: adUFuAGsobtRvT3evbss=:400 : (Service: Amazon S3; Status Code: 400; Error Code: 400 ; Request ID: f1a8d38c4e5afd88; S3 Extended Request ID: adUFuAGsobtRvT3evbss=; Proxy: null)
I am able to create the catalog and the schema through the SQL console, but it throws the error above when creating the table.
CREATE CATALOG backblaze_catalog USING hive
WITH (
    "hive.metastore.uri" = 'thrift://hive-metastore:9083', -- Hive metastore running in another container
    "hive.s3.aws-access-key" = 'KeyID',
    "hive.s3.aws-secret-key" = 'AppKeyId',
    "hive.s3.endpoint" = 'https://s3.us-west-123.backblazeb2.com',
    "hive.s3.path-style-access" = 'true',
    "hive.s3.region" = 'us-west-000',
    "hive.non-managed-table-writes-enabled" = 'true',
    "hive.storage-format" = 'CSV'
);
CREATE SCHEMA backblaze_catalog.raw_data
WITH (
    "location" = 's3a://temporalTests/'
);
CREATE TABLE backblaze_catalog.raw_data.sample_data (
    domain VARCHAR
)
WITH (
    format = 'CSV',
    external_location = 's3a://temporalTests/data/data.csv',
    skip_header_line_count = 1
);
I've managed to test:
Is there something I'm doing wrong?
Thanks.
Your table definition references a file in B2, but it should reference a 'directory' (technically a prefix, since directories don't exist in cloud object storage). Remove data.csv from the table definition and it should work:
CREATE TABLE backblaze_catalog.raw_data.sample_data (
    domain VARCHAR
)
WITH (
    format = 'CSV',
    external_location = 's3a://temporalTests/data/',
    skip_header_line_count = 1
);