hive hadoop-yarn cloudera oozie

Oozie job: YARN returns "Error starting action [hive-4548]"


I have a Cloudera cluster that includes Hue. I need a scheduled task that sends an HQL query to Hive. I'm trying to build the Oozie task with the web workflow editor integrated in Hue.

My HQL query file (request.hql):

INSERT INTO schema_child.table_child
SELECT * from shema_parent.table_parent LIMIT 5 ;

My XML file with the execution plan (workflow.xml):

<workflow-app name="hive-test" xmlns="uri:oozie:workflow:0.1">
    <action name="hive-test">
        <hive xmlns="uri:oozie:hive-action:0.1">
            <job-tracker>claster.site.com:8032</job-tracker>
            <name-node>hdfs://nsld3</name-node>
            <script>/user/myname/oozie/hive_test/request.hql</script>
        </hive>
        <ok to="insert_into_table"/>
        <error to="kill_job"/>
    </action>
</workflow-app>

I have already tried replacing the variables with literal values:

${jobTracker} -> claster.site.com:8032
${nameNode} -> hdfs://nsld3:8020 

But yarn returns:

2021-05-24 18:01:33,162 WARN org.apache.oozie.command.wf.ActionStartXCommand: SERVER[claster.site.com] 
USER[username] GROUP[-] TOKEN[] APP[hive-test] JOB[0000012-210501174618258-oozie-oozi-W] 
ACTION[0000012-210501174618258-oozie-oozi-W@hive-4548] Error starting action [hive-4548].
ErrorType [TRANSIENT], ErrorCode [JA009], Message [JA009: bad conf file: top-level element not <configuration>]

I'm a beginner in Hive, so my work was based on the docs, some examples like this, and Stack Overflow answers like this.
Hive version 1.1.0
Oozie version 4.1.0

Questions:

  1. Why doesn't my Oozie job work?
  2. How do I use variables in the script? Where does Oozie take their values from?

P.S. Sorry for my english.


Solution

  • If the attached execution plan shows the whole content of workflow.xml, then you need to add start, end, and kill nodes to it. The hive action also requires a <job-xml> element with the path to the Hive settings (usually stored at /etc/hive/conf/hive-site.xml).

    Script variables are usually stored in a job.properties file, so parameters like jobTracker and nameNode normally live there. You can also define your own parameters in a <parameters> block at the beginning of workflow.xml.

    In the end it should look something like this:

    <!-- The <parameters> block needs workflow schema 0.4 or later. -->
    <workflow-app name="hive-test-app" xmlns="uri:oozie:workflow:0.4">
        <parameters>
            <property>
                <name>jobTracker</name>
                <value>claster.site.com:8032</value>
            </property>
            <property>
                <name>nameNode</name>
                <value>hdfs://nsld3:8020</value>
            </property>
        </parameters>
        <start to="hive-test" />
        <action name="hive-test">
            <hive xmlns="uri:oozie:hive-action:0.1">
                <job-tracker>${jobTracker}</job-tracker>
                <name-node>${nameNode}</name-node>
                <job-xml>/etc/hive/conf/hive-site.xml</job-xml>   
                <script>/user/myname/oozie/hive_test/request.hql</script>
            </hive>
            <ok to="end"/>
            <error to="kill"/>
        </action>
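        <!-- The workflow schema expects kill nodes before <end>, and each kill node needs a <message>. -->
        <kill name="kill">
            <message>Hive action failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
        </kill>
        <end name="end"/>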
    </workflow-app>
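
    On the second question, the hive action can also hand values to the HQL script itself through <param> elements. A minimal sketch, assuming a hypothetical workflow parameter named parentTable (not part of the original workflow) that is defined in job.properties or in <parameters>. The hive-action namespace version shown is the one used in the Oozie documentation examples; keep whatever version your workflow already uses.

    ```xml
    <hive xmlns="uri:oozie:hive-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <job-xml>/etc/hive/conf/hive-site.xml</job-xml>
        <script>/user/myname/oozie/hive_test/request.hql</script>
        <!-- Hypothetical parameter: Oozie resolves ${parentTable} from the job
             configuration and passes it to the Hive script as PARENT_TABLE. -->
        <param>PARENT_TABLE=${parentTable}</param>
    </hive>
    ```

    Inside request.hql the value would then be referenced as ${PARENT_TABLE}, for example SELECT * FROM ${PARENT_TABLE} LIMIT 5;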