
Core affinity of Map Tasks in Hadoop


Question: Does Hadoop v1.2.1 or v2 (YARN) offer a way to control the core affinity of different Map Tasks within a single job? In other words, can I pin a specific Map Task to a specific core, similar to Linux's taskset, or is it out of Hadoop's control and up to the Linux scheduler?

I am relatively new to MapReduce programming, and my project involves studying its performance as different parameters (machine- or network-specific) are altered. So far I have gone through the official documentation (v1.2.1) and numerous threads, both elsewhere online and on Stack Exchange.

Below I provide two different cases to better illustrate my question, along with my research so far.


Example #1: Suppose I have the following configuration:

  • 2 nodes, with 32 cores per node
  • input size: 2 GiB, HDFS block size: 64 MiB

Given the block size, 2 GiB / 64 MiB = 32 Map Tasks will be spawned. If mapred.tasktracker.map.tasks.maximum is set to 16, then exactly 16 Map Tasks will run on node #1 and 16 on node #2, leaving 16 cores per node to spare. (links: #1, #2)
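
For reference, this per-node cap is a TaskTracker-side setting in Hadoop 1.x; a minimal sketch of the corresponding mapred-site.xml entry (with the value 16 matching the example above) would be:

    <!-- mapred-site.xml on each TaskTracker node (Hadoop 1.x):
         caps the number of Map slots, i.e. concurrently running Map Tasks, per node -->
    <property>
      <name>mapred.tasktracker.map.tasks.maximum</name>
      <value>16</value>
    </property>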

As far as I have found, there is no way to directly control node affinity, i.e., to map specific Map Tasks to specific nodes (link), apart from Hadoop's rack awareness (link). However, within a specific node, may I...

Question #1: ... "pin" each Map Task to a specific core?

Question #2: ... guarantee that each Map Task will stay on the core it started on? Or is this out of Hadoop's control and dependent on the Linux scheduler?


Example #2: Suppose Example #1's configuration, but with an input size of 8 GiB, resulting in 8 GiB / 64 MiB = 128 Map Tasks.

Question #1: Regardless of the value of mapred.tasktracker.map.tasks.maximum, will all 128 Map Tasks be launched simultaneously? Is it correct that, since I have 64 cores in total (over the 2 nodes), each node will on average handle 2 Map Tasks per core?

Question #2: If Question #1 holds, do I have any control (within a single node) over how long a Map Task stays on a single core, and over whether it will be reallocated to the same core afterwards, or is this out of Hadoop's control and up to the Linux scheduler?


Concerning the Reduce Tasks, I assume the corresponding answers would hold for them as well, i.e., core affinity would be equally possible (or not).


Solution

  • This paper provides some insight into task-core affinity: On the Core Affinity and File Upload Performance of Hadoop

    The paper mentions that the POSIX standard defines the sched_setaffinity() system call to set the process-to-core (or, in this case, task-to-core) affinity at the user level.

    But I would appreciate an easier way to define task-core affinity.
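
    To make that call concrete, below is a minimal sketch of pinning a process to one core with sched_setaffinity(). Note that it is in fact a Linux-specific call (exposed through <sched.h> when _GNU_SOURCE is defined), not strictly POSIX, despite the paper's attribution; the core number 0 here is just an example.

        /* pin_to_core.c - pin the calling process to CPU core 0.
           Build: gcc pin_to_core.c -o pin_to_core */
        #define _GNU_SOURCE            /* needed for CPU_* macros and sched_setaffinity() */
        #include <sched.h>
        #include <stdio.h>
        #include <stdlib.h>

        int main(void) {
            cpu_set_t set;
            CPU_ZERO(&set);            /* start with an empty CPU mask */
            CPU_SET(0, &set);          /* allow only core 0 */

            /* pid 0 means "the calling process"; children inherit the mask,
               so a task JVM launched from here would stay on core 0 as well */
            if (sched_setaffinity(0, sizeof(set), &set) == -1) {
                perror("sched_setaffinity");
                exit(EXIT_FAILURE);
            }

            printf("now running on core %d\n", sched_getcpu());
            return 0;
        }

    The shell-level equivalent is taskset (e.g., taskset -c 0 <command>), which the question already mentions. Either way, applying this to Hadoop's Map Tasks would mean wrapping or modifying the way the TaskTracker (or the YARN NodeManager) launches its child JVMs, since Hadoop itself does not appear to expose a per-task affinity setting.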