I have created a Hive table named employee (Avro format), partitioned by department.
My Avro dataset is in an HDFS location, and each record also contains the department id.
I would like to load the data from HDFS into the Hive table so that, during the import, each record ends up in its respective partition.
How can I achieve this? Any ideas?
There are two ways of doing it.
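1. Manual (static) partitioning
Run one LOAD DATA statement per partition. LOAD DATA only moves the files into the partition directory and does not look at the rows, so each path must contain only that department's records.
LOAD DATA INPATH '<hdfs path>'
INTO TABLE employee_table PARTITION (deptId='1');
LOAD DATA INPATH '<hdfs path>'
INTO TABLE employee_table PARTITION (deptId='2');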
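2. Dynamic partitioning
a. Create an intermediate (staging) table over the Avro data in HDFS.
b. Create the employee table, partitioned by department id.
c. Load the data from the intermediate table into the partitioned table with INSERT ... SELECT. Hive routes each row to its partition based on the row's department id.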
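
The dynamic partitioning steps can be sketched roughly as follows. This is only an illustration under some assumptions: the staging table name employee_staging, the columns id, name and salary, the INT type for deptId, and the path '/path/to/avro/data' are all made up for the example, so substitute your own schema and location. STORED AS AVRO needs Hive 0.14 or later.

```sql
-- Allow Hive to create partitions from the data itself.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- a. Intermediate table over the existing Avro files.
--    Table name, columns and LOCATION are hypothetical placeholders.
CREATE EXTERNAL TABLE IF NOT EXISTS employee_staging (
  id     INT,
  name   STRING,
  salary DOUBLE,
  deptId INT
)
STORED AS AVRO
LOCATION '/path/to/avro/data';

-- b. Partitioned target table. The partition column is declared in
--    PARTITIONED BY, not in the regular column list.
CREATE TABLE IF NOT EXISTS employee_table (
  id     INT,
  name   STRING,
  salary DOUBLE
)
PARTITIONED BY (deptId INT)
STORED AS AVRO;

-- c. Copy the rows across. The dynamic partition column must come
--    last in the SELECT list; Hive creates one partition per distinct value.
INSERT INTO TABLE employee_table PARTITION (deptId)
SELECT id, name, salary, deptId
FROM employee_staging;

-- Optional check: list the partitions that were created.
SHOW PARTITIONS employee_table;
```

The staging table is declared EXTERNAL here so that dropping it afterwards leaves the original Avro files in HDFS untouched. Use INSERT OVERWRITE instead of INSERT INTO if reruns should replace the contents of the affected partitions rather than append to them.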