hadoop, hadoop-yarn, mrjob

MapReduce job fails on Hadoop cluster with "subprocess failed with code 1"


I have a Hadoop 3.2.2 cluster with 1 NameNode/ResourceManager and 3 DataNodes/NodeManagers.

This is my yarn-site.xml config:

<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>bd-1</value>
</property>

<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>

<property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

When I run the example job

python mr_word_count.py -r hadoop -v hdfs:///user/hduser/testme.txt
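
For context, mr_word_count.py is essentially the word count example from the mrjob documentation; a minimal version looks roughly like this:

from mrjob.job import MRJob

class MRWordCount(MRJob):

    def mapper(self, _, line):
        # one record per input line; emit counts for chars, words, and lines
        yield "chars", len(line)
        yield "words", len(line.split())
        yield "lines", 1

    def reducer(self, key, values):
        # sum the partial counts for each key
        yield key, sum(values)

if __name__ == '__main__':
    MRWordCount.run()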

I get this error:

Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:326)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:539)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)

What I have done so far:

I can set the Python binary in .mrjob.conf, but then the error code changes to 126.
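
What I added to .mrjob.conf was roughly this (the interpreter path is just what I used on my machines; adjust as needed):

runners:
  hadoop:
    python_bin: /usr/bin/python3   # assumption: path to Python 3 on the task nodes

As far as I understand, exit code 126 from the shell usually means the command was found but is not executable, so the path may have been wrong on the worker nodes.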

In the console I see map 100% reduce 100%. In the web UI I also see that the job is running and that CPU and memory are being consumed by it.

I have been googling and reading Stack Overflow and the Hadoop documentation for 4 days now, for many hours, without a result. Any ideas what could be wrong?


Solution

  • I forgot to install mrjob on all nodes...

    Running this on all nodes fixed the problem: pip3 install mrjob
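
    For completeness, something like this installs it everywhere (the hostnames bd-2, bd-3, bd-4 are just my guess at the NodeManager hosts; adjust for your cluster):

        # hostnames are assumptions; replace with your NodeManager hosts
        for host in bd-2 bd-3 bd-4; do
            ssh "$host" "pip3 install mrjob"
        done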