java linux hadoop yourkit

How to profile Hadoop tasks with YourKit


I'm trying to profile the memory usage of my Hadoop job.

Could someone provide a step-by-step how-to for monitoring Hadoop tasks with YourKit, including setup?


Solution

  • All you have to do is add the following entry to your mapred-site.xml file (found in $HADOOP_HOME/conf/, where $HADOOP_HOME is your Hadoop installation directory):

    <property>
      <name>mapred.child.java.opts</name>
      <value>-agentpath:{yourkit installation directory}/bin/linux-x86-64/libyjpagent.so=tracing,dir={output directory}</value>
    </property>
    

    If you are running on a platform other than linux-x86-64, you may need to change the agent path in the value above to match your platform (see the YourKit documentation for details).

    You can pass any of the startup options listed in the YourKit documentation to the profiler agent.

    This creates a number of snapshots in the specified output directory, one for each child process.
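
If you would rather assemble the option string in code (for example, to set it per job from a driver instead of cluster-wide), a minimal sketch could look like the following. The class name `YourKitAgentOption`, its `build` method, and the paths `/opt/yourkit` and `/tmp/yourkit-snapshots` are hypothetical and exist only for illustration; they are not part of Hadoop or YourKit. The sketch also assumes the Linux agent file name `libyjpagent.so` used above.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative helper (not part of Hadoop or YourKit) that builds the -agentpath JVM option. */
final class YourKitAgentOption {

    /**
     * @param yourKitHome  YourKit installation directory
     * @param platformDir  platform subdirectory under bin/, e.g. "linux-x86-64"
     * @param outputDir    directory the agent should write snapshots to
     * @param agentOptions extra agent options placed before dir=, e.g. "tracing"
     */
    static String build(String yourKitHome, String platformDir,
                        String outputDir, String... agentOptions) {
        List<String> options = new ArrayList<>();
        for (String option : agentOptions) {
            options.add(option);
        }
        // The output directory is always appended as the last agent option.
        options.add("dir=" + outputDir);
        return "-agentpath:" + yourKitHome + "/bin/" + platformDir
                + "/libyjpagent.so=" + String.join(",", options);
    }

    public static void main(String[] args) {
        // Hypothetical paths; replace them with your own installation and output directories.
        String opts = build("/opt/yourkit", "linux-x86-64", "/tmp/yourkit-snapshots", "tracing");
        System.out.println(opts);
    }
}
```

The resulting string is what goes into the value of mapred.child.java.opts, whether in mapred-site.xml or set on the job's configuration in a driver. Note that this property replaces the child JVM options as a whole, so any heap setting you rely on (such as -Xmx) should be included in the same value.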