I am trying to use MPI from Julia embedded in a C program, as shown below. MPI works fine within C itself, but whenever I try to get the rank in Julia, the program crashes and complains that the communicator is invalid. Can anyone help me? I am using Open MPI 4.1.3.
Below is a minimal example that shows the problem on my machine. Basically, it cannot even get the size or rank of MPI.COMM_WORLD in Julia.
#include <mpi.h>
#include <stdio.h>
#include <julia.h>
int main(int argc, char *argv[]) {
MPI_Init(&argc, &argv);
jl_init();
(void) jl_eval_string("println(\"Loading MPI...\")");
(void) jl_eval_string("using MPI");
(void) jl_eval_string("println(\"Done.\")");
(void) jl_eval_string("if MPI.Initialized() ; println(\"MPI is initialized.\") ; else ; println(\"Warning: MPI is not initialized.\") ; end ");
(void) jl_eval_string("comm = MPI.COMM_WORLD");
(void) jl_eval_string("println(comm)");
(void) jl_eval_string("println(MPI.Comm_size(comm))");
jl_atexit_hook(0);
MPI_Finalize();
return 0;
}
Compile with:
mpicc main.c -I$JULIA_INC -L$JULIA_LIB -ljulia -o run.exe
And run with
mpirun -np 2 ./run.exe
My output:
Loading MPI...
Done.
MPI is initialized.
MPI.Comm(1140850688)
[4188745] signal (11.1): Segmentation fault
in expression starting at none:1
PMPI_Comm_size at /home/t2hsu/miniconda3/envs/mpi/lib/libmpi.so.40 (unknown line)
MPI_Comm_size at /home/t2hsu/.julia/packages/MPI/TKXAj/src/api/generated_api.jl:999 [inlined]
Comm_size at /home/t2hsu/.julia/packages/MPI/TKXAj/src/comm.jl:78
jfptr_Comm_size_591 at /home/t2hsu/.julia/compiled/v1.9/MPI/nO0XF_FB87d.so (unknown line)
_jl_invoke at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2758 [inlined]
ijl_apply_generic at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/gf.c:2940
jl_apply at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/julia.h:1880 [inlined]
do_call at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:126
eval_value at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:226
eval_stmt_value at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:177 [inlined]
eval_body at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:624
jl_interpret_toplevel_thunk at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/interpreter.c:762
jl_toplevel_eval_flex at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:912
jl_toplevel_eval_flex at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:856
ijl_toplevel_eval_in at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/toplevel.c:971
ijl_eval_string at /cache/build/default-amdci5-5/julialang/julia-release-1-dot-9/src/jlapi.c:113
main at ./run_c.exe (unknown line)
__libc_start_main at /lib64/libc.so.6 (unknown line)
_start at ./run_c.exe (unknown line)
Allocations: 2997 (Pool: 2985; Big: 12); GC: 0
Segmentation fault (core dumped)
I also tried passing the communicator from C to Julia explicitly (main.c):
#include <mpi.h>
#include <stdio.h>
#include <julia.h>
int main(int argc, char *argv[]) {
int rank, size;
char cmd[1024];
// Initialize the MPI environment
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
printf("Hello from process %d of %d\n", rank, size);
int comm_id = MPI_Comm_c2f(MPI_COMM_WORLD);
jl_init();
(void) jl_eval_string("using MPI");
(void) jl_eval_string("if MPI.Initialized() ; println(\"MPI is initialized.\") ; else ; println(\"Warning: MPI is not initialized.\") ; end ");
snprintf(cmd, sizeof cmd, "comm = MPI.Comm(%d)", comm_id);  // bounded write into cmd
printf("Goinig to evaluate:\n");
printf("%s\n", cmd);  // print cmd as data, never as the format string
(void) jl_eval_string(cmd);
(void) jl_eval_string("println(comm)");
(void) jl_eval_string("println(MPI.Comm_rank(comm))");
jl_atexit_hook(0);
MPI_Finalize();
return 0;
}
Compile with:
mpicc main.c -I$JULIA_INC -L$JULIA_LIB -ljulia -o run.exe
And run with
mpirun -np 2 ./run.exe
However, I got the following error output:
Hello from process 0 of 2
Hello from process 1 of 2
MPI is initialized.
Goinig to evaluate:
comm = MPI.Comm(0)
MPI is initialized.
Goinig to evaluate:
comm = MPI.Comm(0)
MPI.Comm(0)
[exp-18-53:1727695] *** An error occurred in MPI_Comm_rank
[exp-18-53:1727695] *** reported by process [1988952065,1]
[exp-18-53:1727695] *** on communicator MPI_COMM_WORLD
[exp-18-53:1727695] *** MPI_ERR_COMM: invalid communicator
[exp-18-53:1727695] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[exp-18-53:1727695] *** and potentially your MPI job)
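The second program builds a Julia expression in a C buffer and then echoes it. A slightly more defensive sketch of that step is shown here; build_comm_cmd and print_cmd are hypothetical helper names, not part of the Julia or MPI APIs, and this only tidies the string handling (it does not by itself make the handle valid on the Julia side).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes "comm = MPI.Comm(<handle>)" into buf.
 * Returns 0 on success, -1 if formatting fails or the text was truncated. */
int build_comm_cmd(char *buf, size_t len, int handle) {
    assert(buf != NULL && len > 0);
    int n = snprintf(buf, len, "comm = MPI.Comm(%d)", handle);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}

/* Hypothetical helper: echoes a command string. The string is passed as
 * data, so a stray '%' in it is never interpreted as a format directive. */
void print_cmd(const char *cmd) {
    assert(cmd != NULL);
    printf("%s\n", cmd);
}
```

In main.c this would be used as build_comm_cmd(cmd, sizeof cmd, comm_id), checking the return value before handing cmd to jl_eval_string.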
I found the problem: my Julia MPI.jl was not using the same MPI library as the C program.
I solved this by using MPIPreferences, as suggested in the MPI.jl documentation, to reconfigure which MPI library MPI.jl targets.
(void) jl_eval_string("using MPIPreferences");
(void) jl_eval_string("MPIPreferences.use_system_binary(; library_names=[\"/home/t2hsu/miniconda3/envs/mpi/lib/libmpi\"]);");
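The first output already hints at this mismatch. The printed handle MPI.Comm(1140850688) is 0x44000000 in hex, which is the integer constant that MPICH-family mpi.h headers define for MPI_COMM_WORLD, whereas Open MPI represents communicators as pointers. So MPI.jl was most likely built against its default MPICH binary while the call itself landed in Open MPI's libmpi.so.40, as the backtrace shows. A minimal sketch for decoding such a handle follows; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* MPICH-family mpi.h headers define MPI_COMM_WORLD as this integer. */
#define MPICH_COMM_WORLD 0x44000000u

/* Hypothetical helper: returns 1 if a handle value printed by MPI.jl
 * equals MPICH's integer MPI_COMM_WORLD, 0 otherwise. */
int is_mpich_comm_world(uint64_t handle) {
    return handle == (uint64_t)MPICH_COMM_WORLD;
}

/* Hypothetical helper: formats the handle as hex so it can be compared
 * with the constants in mpi.h. Returns 0 on success, -1 on truncation. */
int format_handle_hex(char *buf, size_t len, uint64_t handle) {
    assert(buf != NULL && len > 0);
    int n = snprintf(buf, len, "0x%08llx", (unsigned long long)handle);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

Passing 1140850688 to is_mpich_comm_world returns 1, which is consistent with Julia using MPICH's ABI while the C side was compiled against Open MPI.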