This is the first time I am running a MapReduce program from Oozie.
Here is my job.properties file
nameNode=file:/usr/local/hadoop_store/hdfs/namenode
jobTracker=localhost:8088
queueName=default
oozie.wf.applications.path=${nameNode}/Config
Here is my hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication.
      The actual number of replications can be specified when the file is created.
      The default is used if replication is not specified in create time.
    </description>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop_store/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop_store/hdfs/datanode</value>
  </property>
</configuration>
Here is my core-site.xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hduser.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hduser.groups</name>
    <value>*</value>
  </property>
</configuration>
But when I run the Oozie command to run my MapReduce program, it gives an error that the lib folder is not found:
Error: E0405 : E0405: Submission request doesn't have any application or lib path
oozie job -oozie http://localhost:11000/oozie -config job.properties -run
I've created a Config folder in HDFS, and inside it a lib folder. In the lib folder I placed my MapReduce jar file, and in the Config folder I placed my workflow.xml file. (It's all in HDFS.)
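For reference, the layout was set up with commands along these lines (assuming the folder sits directly under the HDFS root, which is what ${nameNode}/Config resolves to; the jar name below is just a placeholder):
hdfs dfs -mkdir -p /Config/lib
hdfs dfs -put workflow.xml /Config/
hdfs dfs -put datadivider.jar /Config/lib/
hdfs dfs -ls -R /Config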
I think I've given the wrong HDFS path (nameNode) in the job.properties file, and that's why it's not able to find ${nameNode}/Config. Could someone please tell me what the HDFS path should be?
Thanks
Update - 1 job.properties
nameNode=hdfs://localhost:8020
jobTracker=localhost:8088
queueName=default
oozie.wf.applications.path=${nameNode}/Config
Still getting the same error:
Error: E0405 : E0405: Submission request doesn't have any application or lib path
Update - 2: workflow.xml in the Config folder in HDFS.
<workflow-app xmlns="uri:oozie:workflow:0.4" name="simple-Workflow">
  <start to="RunMapreduceJob" />
  <action name="RunMapreduceJob">
    <map-reduce>
      <job-tracker>localhost:8088</job-tracker>
      <name-node>file:/usr/local/hadoop_store/hdfs/namenode</name-node>
      <prepare>
        <delete path="file:/usr/local/hadoop_store/hdfs/namenode"/>
      </prepare>
      <configuration>
        <property>
          <name>mapred.job.queue.name</name>
          <value>default</value>
        </property>
        <property>
          <name>mapred.mapper.class</name>
          <value>DataDividerByUser.DataDividerMapper</value>
        </property>
        <property>
          <name>mapred.reducer.class</name>
          <value>DataDividerByUser.DataDividerReducer</value>
        </property>
        <property>
          <name>mapred.output.key.class</name>
          <value>org.apache.hadoop.io.IntWritable</value>
        </property>
        <property>
          <name>mapred.output.value.class</name>
          <value>org.apache.hadoop.io.Text</value>
        </property>
        <property>
          <name>mapred.input.dir</name>
          <value>/data</value>
        </property>
        <property>
          <name>mapred.output.dir</name>
          <value>/dataoutput</value>
        </property>
      </configuration>
    </map-reduce>
    <ok to="end" />
    <error to="fail" />
  </action>
  <kill name="fail">
    <message>Mapreduce program Failed</message>
  </kill>
  <end name="end" />
</workflow-app>
The <name-node> value should not be a file path. It should point to the NameNode of the underlying Hadoop cluster where Oozie has to run the MapReduce job. Your nameNode should be set to the value of fs.default.name from your core-site.xml:
nameNode=hdfs://localhost:9000
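In workflow.xml, the same values can then be pulled in from job.properties instead of being hard-coded (a minimal sketch of just the two affected elements):
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>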
Also, change the property name oozie.wf.applications.path to oozie.wf.application.path (without the s).
Add the property oozie.use.system.libpath=true to your properties file.
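With those three changes applied, job.properties would look roughly like this (jobTracker left as in the question):
nameNode=hdfs://localhost:9000
jobTracker=localhost:8088
queueName=default
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/Config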
Source: Apache Oozie by Mohammad Kamrul Islam & Aravind Srinivasan