I am trying to combine MPI one-sided communication with OpenMP tasks. The program sets up a Cartesian communicator and communicates some data. Without the OpenMP parts everything works fine. Here is the code with OpenMP:
#pragma omp parallel
#pragma omp single
{
    MPI_Win_post(neighbour_group, 0, win_src);  // expose local window to the neighbours
    MPI_Win_start(neighbour_group, 0, win_src); // open access epoch on the neighbours

    #pragma omp task
    {
        MPI_Put(...);
        MPI_Put(...);
        MPI_Put(...);
        MPI_Put(...);
        MPI_Win_complete(win_src); // all Puts done, close the access epoch
        cout << rank << " complete." << endl;
    }

    #pragma omp parallel for schedule(dynamic, 1)
    for (auto row_index = 0; row_index < rows; row_index++) {
        // do more stuff
    }

    cout << rank << " waiting." << endl;
    MPI_Win_wait(win_src); // wait for the exposure epoch to end
    cout << rank << " waiting finished." << endl;

    // do even more stuff
    MPI_Allreduce(...);
}
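For context, the Cartesian communicator, neighbour_group, and win_src are created roughly like this (a minimal sketch, not my exact code: the 2D grid, periodicity, double buffer, and the names cart_comm/src_buffer are assumptions):

#include <mpi.h>
#include <vector>

// Assumed setup: MPI is already initialized when this runs.
void setup(MPI_Comm &cart_comm, MPI_Group &neighbour_group, MPI_Win &win_src,
           std::vector<double> &src_buffer)
{
    int dims[2] = {0, 0}, periods[2] = {1, 1}, nprocs;
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Dims_create(nprocs, 2, dims);
    MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 0, &cart_comm);

    // Ranks of the four direct neighbours in the Cartesian grid.
    int left, right, up, down;
    MPI_Cart_shift(cart_comm, 0, 1, &left, &right);
    MPI_Cart_shift(cart_comm, 1, 1, &up, &down);

    // Build neighbour_group from those ranks (assumed distinct here;
    // with periodic wrap-around on few processes they may repeat).
    MPI_Group cart_group;
    MPI_Comm_group(cart_comm, &cart_group);
    int neighbour_ranks[4] = {left, right, up, down};
    MPI_Group_incl(cart_group, 4, neighbour_ranks, &neighbour_group);
    MPI_Group_free(&cart_group);

    // Expose the source buffer through an RMA window.
    MPI_Win_create(src_buffer.data(),
                   src_buffer.size() * sizeof(double), sizeof(double),
                   MPI_INFO_NULL, cart_comm, &win_src);
}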
It results in some MPI processes never reaching MPI_Win_complete(), hence the MPI_Win_wait() never finishes. So the problem already seems to lie with the post/put part. I should also mention that I am using OpenMPI to run the code.
My question is whether there are caveats I am not aware of and/or whether there is a flaw in the code.
I tried:
By default, MPI is not aware of other threads, and this causes problems with the calls to MPI_Put(...), which may come not from the thread that initialized MPI but from an OpenMP worker thread. MPI can be initialized to be thread-aware; more on that here: https://www.mpi-forum.org/docs/mpi-3.1/mpi31-report/node303.htm. Unfortunately, the OpenMPI version we currently run does not support MPI_THREAD_MULTIPLE.
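For reference, this is the kind of thread-aware initialization I mean (a minimal sketch; the check on provided matters because MPI may grant a lower level than requested, which is what happens with our OpenMPI build):

#include <mpi.h>
#include <cstdio>
#include <cstdlib>

int main(int argc, char **argv)
{
    int provided;
    // Request full thread support; the library may grant less.
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        // Our OpenMPI build ends up here: only MPI_THREAD_SINGLE,
        // FUNNELED, or SERIALIZED are available.
        std::fprintf(stderr, "MPI_THREAD_MULTIPLE not available (got %d)\n",
                     provided);
        MPI_Finalize();
        return EXIT_FAILURE;
    }

    // ... rest of the program ...

    MPI_Finalize();
    return 0;
}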