I have a Hive Table with 2 columns.Employee ID and Salary.
Data is something like given below.
Employee ID Salary
1 10000.08
2 20078.67
3 20056.45
4 30000.76
5 10045.14
6 43567.76
I want to create Partitions based on Salary Column.For Example Partition for salary range 10000 to 20000, 20001 to 30000.
How do i achieve this.
Hive does not support range partitioning, but you can calculate ranges during data load.
Create table partitioned by salary_range:
create table your_table
(
employee_id bigint,
salary double
)
partitioned by (salary_range bigint)
insert using case for salary range calculation:
insert overwrite table your_table partition (salary_range)
select employee_id, salary,
case
when salary between 10000 and 20000 then 20000
when salary between 20001 and 30000 then 30000
...
else ...
end as salary_range
from some_table;