I'm using CDH5.4. I'm running a Hadoop job that appears to work fine from the command line (when simply run with hadoop jar). However, if I run it from YARN, it finishes silently with a single mapper and no reducers. (Note that it's a Scalding job with a custom runner; all is fine when I run it from the command line.) I really suspect both runs were executing the exact same command, but I want to be sure of that. So I looked at the logs at:
/container_1432733015407_0953_01_000001/container_1432733015407_0953_01_000001/user/stdout/?start=0
and I saw something like:
Main class : org.apache.oozie.action.hadoop.JavaMain
Maximum output : 2048
Arguments :
-D
oneparam=value
-D
secondparam=value
So I took these, turned them into a command line, and ran it with something like:
hadoop jar MyScaldingRunner -D oneparam=value -D secondparam=value
and it ran just fine and produced the results.
Is there a way for me to view the SAME EXACT hadoop jar command line that Hadoop was running when the job was executed via Oozie + YARN? Because from over there it just finishes silently!
I don't have a direct answer to your question, but JDiagnostics could help you recreate the parameters needed, like the classpath or environment variables. Here is an example you can put at the beginning of your program before you run it:
LOG.info(new DefaultQuery().call())
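If adding a library is not an option, a similar dump can be sketched with standard JDK calls only. This is not the JDiagnostics API and it will not print the literal hadoop jar command line; it only reports what the JVM itself saw at startup (program arguments, JVM arguments, classpath, environment). The class name LaunchContextDump and the method describe are hypothetical names chosen for illustration, and the sun.java.command property is HotSpot-specific, so it may print null on other JVMs.

```java
import java.io.File;
import java.lang.management.ManagementFactory;
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

// Hypothetical helper: reports how the current JVM was launched.
class LaunchContextDump {

    // Builds a plain-text report from the program arguments and JVM state.
    static String describe(String[] args) {
        StringBuilder sb = new StringBuilder();
        sb.append("Program arguments: ").append(Arrays.toString(args)).append('\n');
        sb.append("JVM arguments: ")
          .append(ManagementFactory.getRuntimeMXBean().getInputArguments())
          .append('\n');
        // HotSpot-specific property; may be null on other JVMs.
        sb.append("sun.java.command: ")
          .append(System.getProperty("sun.java.command"))
          .append('\n');
        sb.append("Working directory: ")
          .append(System.getProperty("user.dir"))
          .append('\n');

        // One classpath entry per line, so two runs are easy to diff.
        sb.append("Classpath:\n");
        String classpath = System.getProperty("java.class.path", "");
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            sb.append("  ").append(entry).append('\n');
        }

        // Sorted environment, again to make diffs stable.
        sb.append("Environment:\n");
        for (Map.Entry<String, String> e : new TreeMap<>(System.getenv()).entrySet()) {
            sb.append("  ").append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(args));
    }
}
```

Calling LaunchContextDump.describe(args) as the first statement of the runner's main, once from the command line and once under Oozie, gives two reports that can be compared line by line. Under Oozie the text should end up in the same container stdout log shown in the question, assuming it is printed to stdout.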