I have written a script of Hive queries, mainly for feature creation and scoring for a cross-sell project. Most of them are simple queries that do data cleaning, transformation, etc. I want to automate this process so that I can start with a Hive table as input and write the final result to HBase. My questions are:
What is the best way to do it?
Can I simply create filename.sql or filename.hql and run it from the shell using hive -f filename.sql?
Is there something in Hive like PL/SQL?
You can do it in several ways. One option is the Hive CLI, which makes jobs like this very easy. You can write a shell script on Linux or a .bat file on Windows.
In the script, you can simply add entries like these:
$HIVE_HOME/bin/hive -e 'select a.col from tab1 a';
Or, if you have a file:
$HIVE_HOME/bin/hive -f /home/my/hive-script.sql
Make sure $HIVE_HOME is set in your environment. Once you have tested the script and it works, you can add it to a cron job for scheduling.
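If it helps, here is a minimal sketch of a wrapper you could call from cron. It only uses hive -f as shown above; the function name run_hive_script, the log directory argument, and the /tmp default are my own assumptions, not anything Hive provides. It also assumes the Hive CLI exits with a non-zero status when a query in the file fails, which is the usual behaviour but worth checking on your setup.

```shell
#!/usr/bin/env bash
# Sketch only: run one HiveQL file and report success or failure.
# run_hive_script, its arguments, and the /tmp default are hypothetical.

run_hive_script() {
    local script="$1"          # path to the .sql / .hql file
    local log_dir="${2:-/tmp}" # where to keep the Hive output (assumed default)

    # Refuse to run if HIVE_HOME is missing from the environment.
    if [ -z "${HIVE_HOME:-}" ]; then
        echo "HIVE_HOME is not set" >&2
        return 1
    fi
    if [ ! -f "$script" ]; then
        echo "script not found: $script" >&2
        return 1
    fi

    local log_file
    log_file="$log_dir/$(basename "$script").log"

    # hive -f runs the statements in the file; stdout and stderr go to the log.
    if "$HIVE_HOME/bin/hive" -f "$script" >"$log_file" 2>&1; then
        echo "OK: $script"
    else
        echo "FAILED: $script (see $log_file)" >&2
        return 1
    fi
}
```

You would then call run_hive_script /home/my/hive-script.sql at the end of the file, save it as something like /home/my/run_hive.sh (a hypothetical path), and schedule it with a crontab entry such as 0 2 * * * /home/my/run_hive.sh to run it daily at 02:00. Cron starts with a very small environment, so HIVE_HOME usually has to be exported inside the script or defined in the crontab itself.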