hadoop twitter hive flume

Hive: "Duplicate column name" error when creating a table


I am trying to analyze Twitter data. When I try to create a table with the following command:

hive> CREATE external TABLE tweets (
       retweeted boolean, 
       createpapa string,
       place string,
       text string,
       retweeted_status  
       STRUCT<text:STRING,user:STRUCT<screen_name:STRING,name:STRING>,retweet_count:INT>,
       created_at string,
       place string,
       text string,
       entitles STRUCT<urls:ARRAY<STRUCT<expanded_url:STRING>>,user_mentions:ARRAY<STRUCT<screen_name:STRING,name:STRING>>,hashtags:ARRAY<STRUCT<text:STRING>>>,
       source string,
       retweet_count int,
       user STRUCT<locations:string,`following`:string,protected:boolean,verified:boolean,description:string,name:string,created_at:string,followers_count:int,url:string,friends_count:int,screen_name:string>)
       ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
       LOCATION '/sparkEcosystem'; 

I am getting the following error:

FAILED: SemanticException [Error 10036]: Duplicate column name: place

Can anyone help me?


Solution

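  • You declared both 'place string' and 'text string' twice. The error names only the first duplicate Hive encounters, so after removing the extra 'place' you would get the same error for 'text'. Keep one declaration of each and run the statement again. Fields with the same name inside a STRUCT (such as retweet_count in retweeted_status) are nested and do not clash with top-level columns.

    CREATE EXTERNAL TABLE tweets (
      retweeted boolean,
      createpapa string,
      text string,
      retweeted_status STRUCT<text:STRING,user:STRUCT<screen_name:STRING,name:STRING>,retweet_count:INT>,
      created_at string,
      place string,
      entitles STRUCT<urls:ARRAY<STRUCT<expanded_url:STRING>>,user_mentions:ARRAY<STRUCT<screen_name:STRING,name:STRING>>,hashtags:ARRAY<STRUCT<text:STRING>>>,
      source string,
      retweet_count int,
      user STRUCT<locations:string,`following`:string,protected:boolean,verified:boolean,description:string,name:string,created_at:string,followers_count:int,url:string,friends_count:int,screen_name:string>)
    ROW FORMAT SERDE 'com.cloudera.hive.serde.JSONSerDe'
    LOCATION '/sparkEcosystem';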
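
Because the error names one column at a time, it can help to list every repeated top-level column before running the statement. The following is a minimal Python sketch, not part of Hive: the helper names `top_level_columns` and `duplicate_columns` are made up for illustration, and it assumes the column list is pasted in as plain text without comments. It splits on commas that are outside `<...>`, so nested STRUCT fields are ignored, and it compares names case-insensitively.

```python
from typing import List


def top_level_columns(column_list: str) -> List[str]:
    """Return top-level column names, splitting only on commas outside <...>."""
    parts, depth, current = [], 0, []
    for ch in column_list:
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    # The column name is the first whitespace-separated token of each part.
    return [p.split()[0].strip("`").lower() for p in parts if p.strip()]


def duplicate_columns(column_list: str) -> List[str]:
    """Return each repeated column name once, in order of first repetition."""
    seen, dupes = set(), []
    for name in top_level_columns(column_list):
        if name in seen and name not in dupes:
            dupes.append(name)
        seen.add(name)
    return dupes


# Shortened version of the column list from the question.
COLUMNS = """
retweeted boolean,
createpapa string,
place string,
text string,
retweeted_status STRUCT<text:STRING,user:STRUCT<screen_name:STRING,name:STRING>,retweet_count:INT>,
created_at string,
place string,
text string,
source string,
retweet_count int
"""

print(duplicate_columns(COLUMNS))  # ['place', 'text']
```

Here `retweet_count` is not reported, since its first occurrence is a field inside the `retweeted_status` STRUCT rather than a top-level column.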