compilation openmpi

Running OpenMPI program without mpirun


I'm using gcc and OpenMPI. Usually I run MPI programs using the mpirun wrapper -- for example,

mpirun -np 4 myprogram

to start 4 processes.

However, I was wondering if it's possible to easily generate a binary which will do that automatically (maybe with some hardcoded options like -np 4 above).

I know I can write a C wrapper that calls my program, such as the following:

#include <stdlib.h>
#include <unistd.h>

int main() {
        /* The argument vector passed to execvp() must end with NULL */
        char *options[] = { "mpirun", "-np", "4", "myprogram", NULL };

        execvp("mpirun", options);
        /* execvp() returns only if launching mpirun failed */

        return EXIT_FAILURE;
}

but this seems a bit clumsy and I end up with two executables instead of one.

I have tried to explicitly link the MPI libraries, like

gcc -o myprogram -I/usr/lib/openmpi/include/ \
    -lmpi -L/usr/lib/openmpi/lib/ myprogram.c

but when I run the resulting executable, MPI_Comm_size reports a group size of zero (as if I had given -np 0 as an argument). Can I use an environment variable or something else to pass the group size? Or is there another way to build a single-executable MPI program (using Linux and gcc)?


Solution

  • If I understand correctly, you want a self-launching MPI executable. As I wrote in my comment, you can add a special option, e.g. -launchmpi, that makes your code execute mpirun when it is supplied. With Open MPI it is even easier, since it exports special environment variables to the MPI processes it launches, e.g. OMPI_COMM_WORLD_RANK. If this variable exists in the environment, then you know that the program was launched from mpirun and not directly. You can combine both methods in a single check like this:

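    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <mpi.h>

    int main (int argc, char **argv)
    {
        int perform_launch = 0;
        // Scan argv[] for the special option "-launchmpi"
        for (int i = 1; i < argc; i++)
            if (strcmp(argv[i], "-launchmpi") == 0)
                perform_launch = 1;
    
        if (perform_launch || getenv("OMPI_COMM_WORLD_RANK") == NULL)
        {
            // #args = argc + 3 ("mpirun -np 4" added) + NULL
            // #args is reduced by one if "-launchmpi" is present
            char **args = (char **)calloc(
               argc + (perform_launch ? 3 : 4),
               sizeof(char *));
            if (args == NULL)
                return EXIT_FAILURE;
            int n = 0;
            args[n++] = "mpirun";
            args[n++] = "-np";
            args[n++] = "4";
            // Copy the entire argv to the rest of args but skip "-launchmpi"
            for (int i = 0; i < argc; i++)
                if (strcmp(argv[i], "-launchmpi") != 0)
                    args[n++] = argv[i];
    
            execvp("mpirun", args);
    
            // execvp() returns only if mpirun could not be started
            perror("execvp");
            return EXIT_FAILURE;
        }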
    
        // Proceed as regular MPI code
        MPI_Init(&argc, &argv);
        ...
        // Magic happens here
        ...
        MPI_Finalize();
    
        return EXIT_SUCCESS;
    }
    

    If you'd like to control the number of processes in the MPI job, you can supply it as an additional argument, e.g. -launchmpi 12, or in an environment variable, and use its value instead of "4" in the above code.

    Note that MPI executables generally cannot be launched without mpirun. The latter is an integral part of the MPI run-time and it does much more than just launch multiple copies of the MPI executable. Also, you are always linking explicitly to the MPI library when compiling with any of the MPI compiler wrappers (try mpicc -showme). Although you can link MPI libraries statically (not recommended, see here), you will still need mpirun in order to run MPI jobs. AFAIK there is no way to embed mpirun functionality in your program, at least not in Open MPI.
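
The suggestion above of taking the process count from an extra argument or an environment variable could look roughly like the sketch below. It is only an illustration under assumptions: the helper names choose_np and is_positive_int and the environment variable MYPROGRAM_NP are made up for this example and are not part of Open MPI. The helper prefers the value after -launchmpi, then the environment variable, and otherwise falls back to "4".

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if s is a decimal integer greater than 0. */
static int is_positive_int(const char *s)
{
    char *end;
    long value;

    if (s == NULL || *s == '\0')
        return 0;
    value = strtol(s, &end, 10);
    /* The whole string must have been consumed and the value must be > 0 */
    return *end == '\0' && value > 0;
}

/* Hypothetical helper: choose the string to pass after "mpirun -np".
 * Priority: the argument following "-launchmpi", then the environment
 * variable MYPROGRAM_NP (an assumed name), then the default "4". */
const char *choose_np(int argc, char **argv)
{
    const char *env;

    for (int i = 1; i + 1 < argc; i++)
        if (strcmp(argv[i], "-launchmpi") == 0 && is_positive_int(argv[i + 1]))
            return argv[i + 1];

    env = getenv("MYPROGRAM_NP");
    if (is_positive_int(env))
        return env;

    return "4";
}
```

In the launcher code, the result of choose_np(argc, argv) would replace the hardcoded "4". If the count is given as -launchmpi 12, the number should also be skipped when copying argv, so that it is not passed on to the MPI processes as a regular argument.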