hiveairflowhive-configuration

how to pass/read hiveconfs variable inside hql file


using hive operator and passing hql file run hql file in airflow

hive_task = HiveOperator(
    task_id='run_hql',
    hql='/path/to/your_query.hql',
    hive_conn_id='your_hive_connection_id',
    hiveconf_jinja_translate=True,
    hiveconfs={'table_name': 'your_table_name', 'days': 5},
    dag=dag,
)

And hql content as below

-- Define parameters
SET hivevar:table_name=YOUR_DEFAULT_TABLE_NAME;
SET hivevar:days_offset=YOUR_DEFAULT_DAYS_OFFSET;

-- Use parameters in the query
SET days={{ macros.dateutil.relativedelta.relativedelta(days=-hivevar:days) | replace(":", "-") }};

ANALYZE TABLE edw_health.${hivevar:table_name} PARTITION(ds = '{{ days }}') COMPUTE STATISTICS;
ANALYZE TABLE edw_health.${hivevar:table_name} PARTITION(ds = '{{ days }}') COMPUTE STATISTICS FOR COLUMNS;

getting error as 'hiveconf' is undefined.

Could any one suggest if something i am missing and how to read the hive confs variable values inside hql file


Solution

  • You are using hivevar namespace in the Hive script:

     ${hivevar:table_name}
    

    But hiveconfs are passed as key=value in hiveconf namespace.

    See hive_operator

    Use hiveconf instead of hivevar:

     ${hiveconf:table_name}