I tried creating an table on aws-athena with hive on parquet data with following :
CREATE TABLE IF NOT EXISTS db.test (
country STRING ,
day_part STRING ,
dma STRING ,
first_seen STRING,
geohash STRING ,
last_seen STRING,
location_backfill ARRAY <
element STRUCT <
backfill_type: BIGINT,
brq: BIGINT ,
first_seen: STRING,
last_seen: STRING ,
num_days: BIGINT >>
)
ROW FORMAT SERDE
'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
's3://<location>'
TBLPROPERTIES (
'parquet.compress'='SNAPPY',
'transient_lastDdlTime'='<sometime>')
i repeatedly get the error
line 9:12: mismatched input 'struct' expecting {'(', 'array', '>'} (service: amazonathena; status code: 400; error code: invalidrequestexception; request id: )
The syntax seems fine and not sure. The data is stored in s3 path any idea what may be causing this problem?
Array elements are not named, specify only type (struct):
location_backfill ARRAY <
STRUCT <
backfill_type: BIGINT,
brq: BIGINT ,
first_seen: STRING,
last_seen: STRING ,
num_days: BIGINT >>