hivehiveddl

Hive alter table column fails when it has struct column


I've created hive external table.

CREATE EXTERNAL TABLE test_db.test_table (
   `testfield` string,
   `teststruct` struct<teststructfield:string>
   )
 ROW FORMAT SERDE                                   
   'org.apache.hive.hcatalog.data.JsonSerDe'        
 STORED AS INPUTFORMAT                              
   'org.apache.hadoop.mapred.TextInputFormat'       
 OUTPUTFORMAT                                       
   'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' 
 LOCATION                                           
   'hdfs://some/path';


hive> describe test_table;

+-------------+---------------------------------+--------------------+
|  col_name   |            data_type            |      comment       |
+-------------+---------------------------------+--------------------+
| testfield   | string                          | from deserializer  |
| teststruct  | struct<teststructfield:string>  | from deserializer  |
+-------------+---------------------------------+--------------------+

and I want to alter table column.
but when table has struct column (teststruct), error occurs with < less than sign.

ALTER TABLE test_db.test_table CHANGE COLUMN testfield testfield2 string;

INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Starting task [Stage-0:DDL] in serial mode
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Error: type expected at the position 7 of 'string:<derived from deserializer>' but '<' is found.

It succeed without struct column which has <. what should I do for this problem?


Solution

  • If nothing else helps, as a workaround you can drop/create table and recover partitions. The table is EXTERNAL and drop will not affect the data.

    (1) Drop table

    DROP TABLE test_db.test_table;
    

    (2) Create table with required column name

    CREATE EXTERNAL TABLE test_db.test_table (
       testfield2 string,
       teststruct struct<teststructfield:string>
       )
       PARTITIONED BY (....)
     ROW FORMAT SERDE                                   
       'org.apache.hive.hcatalog.data.JsonSerDe'        
     LOCATION                                           
       'hdfs://some/path';
    

    (3) Recover partitions

    MSCK REPAIR TABLE test_db.test_table;
    

    or if you are running Hive on EMR:

    ALTER TABLE test_db.test_table RECOVER PARTITIONS;