hivehiveqlcloudera-manager

hive doesn't support merge function


trying to update the value from table to another table, both of these tables have the same field name but different values, the query must be work fine on any normal DB but here it returns

Error while compiling statement: FAILED: ParseException line 1:0 cannot recognize input near 'MERGE' 'INTO' 'FINAL'

MERGE 
INTO FINAL 
USING FIRST_STAGE
ON IMSI = FIRST_STAGE.IMSI and Site = FIRST_STAGE.Site
WHEN MATCHED THEN UPDATE SET
  Min_Date = least(FIRST_STAGE.Min_Date, Min_Date),
  Max_Date = greatest(FIRST_STAGE.Max_Date, Max_Date),
  NoofDays = FIRST_STAGE.NoofDays + NoofDays,
  Down_Link = FIRST_STAGE.Down_Link + Down_Link,
  up_Link = FIRST_STAGE.up_Link + up_Link,
  connection = FIRST_STAGE.connection + connection
WHEN NOT MATCHED THEN INSERT ( Min_Date, 
  Max_Date, 
  NoofDays, 
  IMSI, 
  Site, 
  Down_Link, 
  Up_Link, 
  Connection )
VALUES ( FIRST_STAGE.Min_Date, 
  FIRST_STAGE.Max_Date, 
  FIRST_STAGE.NoofDays, 
  FIRST_STAGE.IMSI, 
  FIRST_STAGE.Site, 
  FIRST_STAGE.Down_Link, 
  FIRST_STAGE.Up_Link, 
  FIRST_STAGE.Connection )

Solution

  • Hive merge statement is introduced in Hortonworks distribution.

    AFAIK In case of Cloudera distribution we need to use Kudu to perform upsert operations starting from cloudera-5.10+.

    Note: Upsert statement only works for Impala tables that use the Kudu storage engine.