jsonamazon-athenaspacesfield-names

How to handle spaces in filed names in a json file while creating Athena table


My Json file has field names with spaces in it like "Customer ID".The json file sits in the S3 bucket .So,when I try creating an Athena table on this json file ,it throws me error as the field names has spaces.it loads fine when the fields with spaces are removed while loading.How do I handle this situation so the entire data gets properly loaded.


Solution

  • if you have a chance convert your json file to csv

    you can try something like :

    CREATE EXTERNAL TABLE IF NOT EXISTS db.table_name (
        .......
        `Customer_ID` int , 
        .......
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
    WITH SERDEPROPERTIES (
        'separatorChar' = ',',
        'quoteChar' = '\"',    
        'escapeChar' = '\\' ) 
    STORED AS TEXTFILE LOCATION 's3://location'
    TBLPROPERTIES ('skip.header.line.count'='1')
    

    main thought is  - TBLPROPERTIES ('skip.header.line.count'='1') here you skip header in you .csv file and set your custom column name