I am using the HiveOperator in Airflow to run an HQL file, passing the file path as hql:
hive_task = HiveOperator(
    task_id='run_hql',
    hql='/path/to/your_query.hql',
    hive_conn_id='your_hive_connection_id',
    hiveconf_jinja_translate=True,
    hiveconfs={'table_name': 'your_table_name', 'days': 5},
    dag=dag,
)
The HQL file contains the following:
-- Define parameters
SET hivevar:table_name=YOUR_DEFAULT_TABLE_NAME;
SET hivevar:days_offset=YOUR_DEFAULT_DAYS_OFFSET;
-- Use parameters in the query
SET days={{ macros.dateutil.relativedelta.relativedelta(days=-hivevar:days) | replace(":", "-") }};
ANALYZE TABLE edw_health.${hivevar:table_name} PARTITION(ds = '{{ days }}') COMPUTE STATISTICS;
ANALYZE TABLE edw_health.${hivevar:table_name} PARTITION(ds = '{{ days }}') COMPUTE STATISTICS FOR COLUMNS;
I am getting the error: 'hiveconf' is undefined.
Could anyone suggest what I am missing, and how to read the hiveconfs values inside the HQL file?
You are using the hivevar namespace in the Hive script:
${hivevar:table_name}
But the values in hiveconfs are passed as key=value pairs in the hiveconf namespace.
See hive_operator
Use hiveconf instead of hivevar:
${hiveconf:table_name}
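
To see why the namespace matters, here is a minimal sketch of how a hiveconfs dict could be flattened into -hiveconf key=value command-line arguments. This is an illustration, not Airflow's actual code: the helper name hiveconf_cli_args is hypothetical, and the dict is the one from the question.

```python
def hiveconf_cli_args(hiveconfs):
    """Flatten a dict into ['-hiveconf', 'key=value', ...] argument pairs.

    Hypothetical helper for illustration only; it approximates how the
    hiveconfs dict ends up on the Hive command line.
    """
    args = []
    for key, value in hiveconfs.items():
        # Each entry becomes its own "-hiveconf key=value" pair.
        args.extend(["-hiveconf", f"{key}={value}"])
    return args


# The dict passed to HiveOperator in the question.
args = hiveconf_cli_args({"table_name": "your_table_name", "days": 5})
print(" ".join(args))
# prints: -hiveconf table_name=your_table_name -hiveconf days=5
```

Because every value arrives through -hiveconf, the script can read it as ${hiveconf:table_name} or ${hiveconf:days}. Nothing is ever set in the hivevar namespace, so ${hivevar:table_name} only sees whatever the script itself assigned with SET hivevar:... .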