mpi hpc gprof

How do I get meaningful results from gprof on an MPI code?


I am optimising an MPI code and profiling it with gprof. The problem is that the results I obtain are completely unreasonable. My workflow is the following:

What is wrong with this?


Solution

  • Instructions for running gprof typically assume the program is serial, or single-process but multi-threaded.

    To run gprof with a multi-process program like an MPI program, you'll want to

    1. make sure each process outputs its own file
    2. explicitly sum the files across processes
    3. run gprof on the results.

    This blog post or these instructions at LLNL are good starting points. In short:

    1. Set the poorly documented GMON_OUT_PREFIX environment variable before running the mpiexec command, e.g. in bash, export GMON_OUT_PREFIX=gmon.out-. Each process then writes its own profile file, named from this prefix plus its process ID. Depending on the environment, you may also have to forward the variable explicitly to make sure each process has it, e.g. with Open MPI, mpirun -x GMON_OUT_PREFIX -np Nproc EXEC.exe arg1 ... argN.
    2. Use gprof itself to collect and sum the results into gmon.sum: gprof -s EXEC.exe gmon.out-*
    3. Run gprof on the sum: gprof EXEC.exe gmon.sum. Alternatively, examine a single file with gprof EXEC.exe gmon.out-12345, or all the files collectively with gprof EXEC.exe gmon.out-*.
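
    The three steps above can be collected into a small bash helper. This is only a sketch: the names run and profile_mpi and the DRY_RUN switch are made up for illustration, it assumes Open MPI's mpirun (for the -x option), and it assumes the executable was already compiled and linked with -pg.

```shell
#!/usr/bin/env bash
# Sketch: profile an MPI program with gprof, one gmon file per process.
# Hypothetical helper names (run, profile_mpi, DRY_RUN); assumes Open MPI's
# mpirun, whose -x option exports an environment variable to every rank.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# usage: profile_mpi NPROC EXEC [ARGS...]
# EXEC must have been compiled and linked with -pg.
profile_mpi() {
    local nproc=$1 exe=$2
    shift 2

    # 1. Each process writes its own profile, named from this prefix plus its PID.
    export GMON_OUT_PREFIX=gmon.out-
    run mpirun -x GMON_OUT_PREFIX -np "$nproc" "$exe" "$@" || return

    # 2. Sum the per-process profiles into gmon.sum.
    run gprof -s "$exe" gmon.out-* || return

    # 3. Report on the summed profile.
    run gprof "$exe" gmon.sum
}
```

    After sourcing the file, DRY_RUN=1 profile_mpi 4 ./EXEC.exe arg1 only prints the three commands, which is a cheap way to check them before a real run. Profile files left over from an earlier run also match gmon.out-* and would be summed in, so it is worth deleting them first.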