I'm using Firehose to write records in Parquet format to an S3 bucket, and I've manually defined a Glue table.
Now I want to load those files into Redshift, so I've got a manifest like:
{
"entries": [
{"url":"s3://my-bucket/file1.parquet"},
{"url":"s3://my-bucket/file2.parquet"}
]
}
And a COPY command like:
COPY schema_name.table_name
FROM 's3://my-bucket/manifest.json'
CREDENTIALS 'aws_iam_role=arn:aws:iam::123456:role/RoleWithPermissionToRedshiftAndBucket'
PARQUET
MANIFEST;
It fails with this mysterious error, which has zero results on Google:
[XX000][500310] [Amazon](500310) Invalid operation: COPY with MANIFEST parameter requires full path of an S3 object.
Details:
-----------------------------------------------
error: COPY with MANIFEST parameter requires full path of an S3 object.
code: 8001
context:
query: 23514459
location: scan_range_manager.cpp:795
process: padbmaster [pid=108497]
-----------------------------------------------;
It seems to me that I am definitely specifying the full path, so I'm not sure what's up.
One thing that was wrong in my setup was that the bucket was in a different region, which on its own would also prevent the COPY from working.
Another possible cause of this error message is a bucket that lives in another AWS account.
But what actually fixed it for me was adding content_length to every entry in the manifest, since it is required for Parquet:
{
"entries": [
{
"url":"s3://my-bucket/file1.parquet",
"mandatory":true,
"meta":{
"content_length":2893394
}
},
{
"url":"s3://my-bucket/file2.parquet",
"mandatory":true,
"meta":{
"content_length":2883626
}
}
]
}
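If you have a lot of files, filling in content_length by hand gets tedious. Here's a minimal sketch in Python that builds the same manifest from (url, size) pairs. The function name build_manifest is something I made up for illustration, and the sizes are just the ones from the example above. In practice you could probably read each size from the ContentLength field of boto3's head_object response; I've left that as a comment so the snippet runs without AWS access.

```python
import json


def build_manifest(objects):
    """Build a Redshift COPY manifest from (url, size_in_bytes) pairs.

    Hypothetical helper for illustration, not part of any AWS library.
    """
    entries = []
    for url, size in objects:
        entries.append({
            "url": url,
            "mandatory": True,
            # content_length must be the object's actual size in bytes
            "meta": {"content_length": int(size)},
        })
    return {"entries": entries}


# Sizes are hard-coded here. With boto3 they could be looked up like this
# (assumes credentials and the bucket/key names are set up on your side):
#   size = boto3.client("s3").head_object(Bucket=bucket, Key=key)["ContentLength"]
manifest = build_manifest([
    ("s3://my-bucket/file1.parquet", 2893394),
    ("s3://my-bucket/file2.parquet", 2883626),
])

manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

Upload the printed JSON as manifest.json and point the COPY command at it as before.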
Apparently, if you leave content_length out, the error message you get has nothing to do with the real problem. Someone else made the same mistake and got an error message saying
File has an invalid version number
See the related question "Error while loading parquet format file into Amazon Redshift using copy command and manifest file".