hadoop oozie

Dynamically calculating an Oozie parameter (number of reducers for an MR action)


In my Oozie workflow I dynamically create a Hive table, say T1. This Hive action is followed by a map-reduce action. I want to set the number-of-reducers property (mapred.reduce.tasks) to the number of distinct values of a field, say T1.group. How can I set the value of an Oozie parameter dynamically, and how can I pass the result of the Hive distinct query into that parameter?


Solution

  • I hope this can help:

    1. Create the Hive table as you already do.
    2. Run another Hive query that counts the distinct values of the column and writes the result to a file in HDFS.
    3. Add a shell action that reads the file and echoes the value in the form key=value. Enable capture-output for the shell action.
    4. In the MR action, read the captured value with the Oozie EL function wf:actionData, e.g. ${wf:actionData('ShellAction')['key']}, and pass it as the value of mapred.reduce.tasks in the configuration element of the MR action.
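
Steps 3 and 4 could be wired together roughly as in the workflow fragment below. This is a minimal sketch, not a tested workflow: the action names (count-groups, mr-job), the script name count_groups.sh, the output key numReducers, and the parameters ${jobTracker}, ${nameNode} and ${distinctCountDir} are all hypothetical, and the schema versions may need adjusting for your Oozie release. The Hive actions from steps 1 and 2 and the mapper/reducer class properties are omitted.

```xml
<workflow-app name="dynamic-reducers-wf" xmlns="uri:oozie:workflow:0.4">
    <start to="count-groups"/>

    <!-- Step 3: shell action whose stdout is captured by Oozie.
         count_groups.sh is a hypothetical script that is assumed to read
         the file written by the Hive query in step 2 (for example with
         "hadoop fs -cat") and print a single line: numReducers=<count> -->
    <action name="count-groups">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>count_groups.sh</exec>
            <argument>${distinctCountDir}</argument>
            <file>count_groups.sh#count_groups.sh</file>
            <capture-output/>
        </shell>
        <ok to="mr-job"/>
        <error to="fail"/>
    </action>

    <!-- Step 4: MR action that reads the captured value through EL -->
    <action name="mr-job">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.reduce.tasks</name>
                    <value>${wf:actionData('count-groups')['numReducers']}</value>
                </property>
                <!-- mapper, reducer, input and output properties go here -->
            </configuration>
        </map-reduce>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Workflow failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

The name passed to wf:actionData must match the name attribute of the shell action, and the key in brackets must match the key the script prints. On newer Hadoop releases mapred.reduce.tasks is deprecated in favor of mapreduce.job.reduces, so the property name may need to change accordingly.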