
Profiling Apache Hive CLI


The linked guide, Profiling Hive CLI, gives instructions for profiling the Hive CLI with Java Mission Control. The steps are:

  1. Create a directory to save profiler output: mkdir $HOME/profiles

  2. Create an alias so the options are easy to reuse: alias debug='HADOOP_CLIENT_OPTS="-XX:+UnlockCommercialFeatures -XX:+FlightRecorder -XX:FlightRecorderOptions=defaultrecording=true,dumponexit=true,dumponexitpath=$HOME/profiles/"'

  3. Run the Hadoop client command you want to profile. For example, to profile Hive CLI startup (hence -e 'exit;') with TRACE output as well: debug hive --hiveconf hive.root.logger=TRACE,console -e 'exit;' 2>&1 | tee $HOME/profiles/hive_trace.out

  4. Archive the directory from step 1 so it can be collected: tar czvf profile_data.tgz $HOME/profiles
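For repeat runs, the four steps can be combined into one script. This is a minimal sketch, not part of the linked guide: the PROFILE_DIR and JFR_OPTS variable names are my own, the check for hive on the PATH is an added guard, and the archive is created with -C so that it holds a relative profiles/ path instead of an absolute one. Aliases are not expanded in non-interactive scripts, so the sketch sets HADOOP_CLIENT_OPTS inline rather than relying on the debug alias.

```shell
#!/usr/bin/env bash
# Sketch of steps 1-4. PROFILE_DIR and JFR_OPTS are illustrative names.

# Step 1: directory for the flight recordings and the trace log.
PROFILE_DIR="${PROFILE_DIR:-$HOME/profiles}"
mkdir -p "$PROFILE_DIR"

# Step 2: the same JVM flags as the alias, kept in a variable.
JFR_OPTS="-XX:+UnlockCommercialFeatures -XX:+FlightRecorder -XX:FlightRecorderOptions=defaultrecording=true,dumponexit=true,dumponexitpath=$PROFILE_DIR/"

# Step 3: profile Hive CLI startup with TRACE logging, if hive is installed.
if command -v hive >/dev/null 2>&1; then
  HADOOP_CLIENT_OPTS="$JFR_OPTS" hive --hiveconf hive.root.logger=TRACE,console -e 'exit;' 2>&1 \
    | tee "$PROFILE_DIR/hive_trace.out"
else
  echo "hive not found on PATH; skipping the profiled run" >&2
fi

# Step 4: archive the directory, stored relative to its parent.
tar czf profile_data.tgz -C "$(dirname "$PROFILE_DIR")" "$(basename "$PROFILE_DIR")"
```

Setting PROFILE_DIR before running the script writes the recordings somewhere other than $HOME/profiles.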

My questions are:

a) After step 4, how does one use Java Mission Control to consume the collected metrics?

b) When I start Hive using the settings from steps 2 and 3, why is Hive not visible in the Java Mission Control console?

c) Is there a better way to profile Hive components such as hive-exec and hive-metastore?


Solution

  • a) You should now have a number of *.jfr files in $HOME/profiles; these can be opened and analyzed in JMC. Here is a link to the official docs on how to do this: https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/tooldescr005.html (there is a lot more information, including videos, if you search online).

    b) How do you start Hive? Is it with the same user that you run JMC with? Can you see other JVMs on the system? If you run jps or jcmd, is the Hive process listed there?
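
The checks in b) can be run from a shell while Hive is still up. This is a rough sketch under a few assumptions: the list_jvms function name is made up for illustration, and the grep for "hive" assumes the Hive CLI's command line contains that string (for example a hive-cli jar or a CliDriver class name), which may not hold for every installation. Both jcmd and jps only list JVMs that the current user is allowed to see, so run it as the same user that starts JMC.

```shell
#!/usr/bin/env bash
# List the JVMs visible to the current user, preferring jcmd over jps.
# list_jvms is a hypothetical helper name, not a JDK tool.
list_jvms() {
  if command -v jcmd >/dev/null 2>&1; then
    jcmd -l        # pid, main class or jar, and arguments
  elif command -v jps >/dev/null 2>&1; then
    jps -lm        # pid, full main class name, and arguments
  else
    echo "neither jcmd nor jps found on PATH" >&2
    return 1
  fi
}

# Look for a Hive process among them (assumes "hive" appears in its command line).
list_jvms | grep -i hive || echo "no Hive JVM visible to user $(id -un)"
```

If the process shows up here, jcmd <pid> JFR.check (with the pid from the listing) reports whether a flight recording is active in that JVM. Keep in mind that hive -e 'exit;' terminates almost immediately, so a process started that way may simply be gone before JMC or jcmd can list it.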