apache-hive

Insert data from HDFS into an Avro-formatted, partitioned Hive table


I have created a Hive table named employee (Avro format), partitioned by department.

I have an Avro dataset in HDFS. The dataset also contains the department ID.

I would like to load this data from HDFS into the Hive table so that each record ends up in its respective partition.

How can I achieve this?


Solution

  • There are two ways to do this.

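    1. Manual (static) partitioning

    LOAD DATA INPATH '<hdfs path>' INTO TABLE employee_table PARTITION (deptId='1');

    LOAD DATA INPATH '<hdfs path>' INTO TABLE employee_table PARTITION (deptId='2');

    LOAD DATA only moves the files into the partition directory; it does not split them by department. Each HDFS path must therefore contain data for that one department only.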

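    2. Dynamic partitioning

    a. Create an intermediate (unpartitioned) table over the Avro data in HDFS

    b. Create the employee table partitioned by department

    c. Load the data from the intermediate table into the partitioned table with INSERT ... SELECT, so that Hive routes each row to its partition based on the department value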
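
The dynamic-partitioning steps above could look roughly like the following HiveQL. This is only a sketch under assumptions: the staging table name employee_staging, the columns id and name, and the HDFS location are hypothetical placeholders and need to be replaced with the real schema and path. It also assumes a Hive version that supports STORED AS AVRO (0.14 or later).

```sql
-- Allow Hive to create partitions from the values in the data.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- a. Intermediate table over the existing Avro files.
--    Table name, columns, and location are hypothetical.
CREATE EXTERNAL TABLE IF NOT EXISTS employee_staging (
  id     INT,
  name   STRING,
  deptId STRING
)
STORED AS AVRO
LOCATION '/path/to/avro/dataset';

-- b. Partitioned target table. The partition column is declared
--    in PARTITIONED BY, not in the regular column list.
CREATE TABLE IF NOT EXISTS employee_table (
  id   INT,
  name STRING
)
PARTITIONED BY (deptId STRING)
STORED AS AVRO;

-- c. Copy the rows; the partition column must come last in the SELECT.
INSERT INTO TABLE employee_table PARTITION (deptId)
SELECT id, name, deptId
FROM employee_staging;

-- Optional check: list the partitions that were created.
SHOW PARTITIONS employee_table;
```

Because the staging table is EXTERNAL, dropping it afterwards leaves the original Avro files in HDFS untouched.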