Tags: hadoop, apache-spark, hadoop-yarn, elastic-map-reduce

Can I force YARN to use the master node for the Application Master container?


My big ol' master node sits practically idle during my Hadoop/Spark runs because YARN places the Application Master on a random slave node for each job. I liked the old Hadoop 1 way better: with everything running on the master, a lot of log chasing and SSH pain was avoided when things went wrong.

Is it possible?


Solution

  • It's possible with Spark and YARN node labels.

    1. Label your nodes (for example, give the master node its own label).
    2. Set the spark.yarn.am.nodeLabelExpression property so the Application Master is scheduled only on nodes carrying that label.

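The two steps above look roughly like this on a plain YARN cluster; the label name "master" and the host name "master-host" are placeholders for your own values, and node labels must first be enabled in yarn-site.xml (yarn.node-labels.enabled=true plus a yarn.node-labels.fs-store.root-dir path) on the ResourceManager:

```shell
# Prerequisite (yarn-site.xml on the ResourceManager):
#   yarn.node-labels.enabled = true
#   yarn.node-labels.fs-store.root-dir = hdfs:///yarn/node-labels   (example path)

# Create a non-exclusive label so other containers can still use the node
# (non-exclusive labels need a sufficiently recent YARN), then attach it
# to the master's NodeManager. "master" and "master-host" are placeholders.
yarn rmadmin -addToClusterNodeLabels "master(exclusive=false)"
yarn rmadmin -replaceLabelsOnNode "master-host=master"

# Pin only the Application Master to that label; executors are placed as usual.
# Note: spark.yarn.am.* settings apply in client mode; in cluster mode the
# driver runs inside the AM, so use spark.yarn.driver.nodeLabelExpression there.
spark-submit \
  --master yarn \
  --deploy-mode client \
  --conf spark.yarn.am.nodeLabelExpression=master \
  my_job.py
```

This is a sketch of the general recipe, not an exact EMR walkthrough; on EMR the same idea applies, but the hostnames and configuration mechanism (cluster configuration objects) differ.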