I am getting the following error when I try to query JSON file using a glue table via Athena :
HIVE_METASTORE_ERROR: Error: type expected at the position 0 of 'ARRAY <STRING>' but 'ARRAY' is found. (Service: null; Status Code: 0; Error Code: null; Request ID: null; Proxy: null)
This query ran against the "" database, unless qualified by the query.
The Glue table schema is as follows :
[
{
"Name": "id",
"Type": "string",
"Comment": ""
},
{
"Name": "eligibleterritorycodes",
"Type": "ARRAY <STRING>",
"Comment": ""
},
{
"Name": "excludedterritorycodes",
"Type": "ARRAY <STRING>",
"Comment": ""
},
{
"Name": "isdeleted",
"Type": "boolean",
"Comment": ""
},
{
"Name": "requestsuccessful",
"Type": "boolean",
"Comment": ""
}
]
And the bucket to which this Glue table points just has one file with a single line :
{"id":"-1000000102013416","eligibleTerritoryCodes":["00"],"excludedTerritoryCodes":["US"],"isDeleted":false,"requestSuccessful":true}
I have also tried setting ignore.malformed.json
as "true" for the glue table but to no success. I am not sure why it is giving such error when the glue table structure is exactly as the data in the file. Any help would be appreciated.
I could not get past this error but the following Athena query helped me resolve the issue and unblock me:
CREATE EXTERNAL TABLE odp_table (
dmid string,
eligibleterritorycodes array<string>,
excludedterritorycodes array<string>,
isdeleted boolean,
requestsuccessful boolean
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES ('ignore.malformed.json' = 'true')
LOCATION 's3://odp-update-beta/final_request_models/"json_table" ';
select * from json_table;
So instead of using Glue table, I have used created an external table in athena with the same schema. This helpes me query the JSON file.