I am optimising an MPI code and profiling it with gprof. The problem is that the results I obtain are completely unreasonable. My workflow is the following:
1. Compiling the code, adding -pg as a compilation flag.
2. Running the code: mpirun -np Nproc EXEC.exe arg1 ... argN
3. Running gprof on the executable: gprof EXEC.exe
What's wrong with this?
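Instructions for running gprof typically assume the program is serial, or single-process but multi-threaded. By default every process writes its profile to the same gmon.out file, so the processes of an MPI job overwrite one another's data. To run gprof with a multi-process program like an MPI program, you'll want to make sure each process writes its own output file, and then run gprof on the results. This blog post or these instructions at LLNL are good starting points: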
1. Set the GMON_OUT_PREFIX environment variable, e.g. in bash, export GMON_OUT_PREFIX=gmon.out- before running the mpiexec command (then, depending on the environment, you may have to run mpirun -x GMON_OUT_PREFIX -np Nproc EXEC.exe arg1 ... argN to make sure each process has the environment variable).
2. Run gprof -s EXEC.exe gmon.out-* to sum the per-process files into gmon.sum.
3. Run gprof EXEC.exe gmon.sum (or just examine the individual files, or the files collectively, with gprof EXEC.exe gmon.out-12345 or gprof EXEC.exe gmon.out-*).
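
The steps above can be collected into a small shell function. This is only a sketch under some assumptions: the function name profile_mpi, the run helper and its DRY_RUN switch, and the default of 4 processes are made up for illustration, and the -x flag is Open MPI's way of forwarding an environment variable (other MPI launchers may forward it automatically or use a different option). The executable is assumed to have been compiled and linked with -pg already.

```shell
#!/bin/sh
# Sketch of a per-process gprof workflow for an MPI program.
# Hypothetical names: profile_mpi, run, DRY_RUN, NPROC, EXE.
set -eu

NPROC="${NPROC:-4}"          # number of MPI processes (placeholder default)
EXE="${EXE:-./EXEC.exe}"     # executable built with -pg (compile and link)

# Execute a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

profile_mpi() {
    # Each process writes a file named from this prefix plus its process ID,
    # instead of every process overwriting the same gmon.out.
    export GMON_OUT_PREFIX=gmon.out-

    # -x forwards the variable to all ranks (Open MPI syntax, an assumption).
    run mpirun -x GMON_OUT_PREFIX -np "$NPROC" "$EXE" "$@"

    # Sum the per-process profiles into gmon.sum, then report on the sum.
    run gprof -s "$EXE" gmon.out-*
    run gprof "$EXE" gmon.sum
}
```

Calling profile_mpi arg1 ... argN runs the three commands in order; with DRY_RUN=1 set, it only prints them, which is handy for checking the command lines on a machine without MPI installed.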