I'm currently trying to understand mpi4py. I set mpi4py.rc.initialize = False and mpi4py.rc.finalize = False because I can't see why we would want initialization and finalization to happen automatically. The default behavior is that MPI.Init() gets called when MPI is imported. I think the reason is that each rank runs its own instance of the Python interpreter and each of those instances runs the whole script, but that's just a guess. In the end, I'd like to have it explicit.
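Here is a minimal sketch of the explicit setup I mean (it also shows that every rank runs the whole script, since each one prints its own rank):

import mpi4py
mpi4py.rc.initialize = False  # no MPI_Init at import time
mpi4py.rc.finalize = False    # no MPI_Finalize at interpreter exit
from mpi4py import MPI

MPI.Init()
assert MPI.Is_initialized() and not MPI.Is_finalized()
print(f"rank {MPI.COMM_WORLD.rank} of {MPI.COMM_WORLD.size}")
MPI.Finalize()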
Now this introduced some problems. I have this code:
import numpy as np
import mpi4py
mpi4py.rc.initialize = False  # do not initialize MPI automatically
mpi4py.rc.finalize = False    # do not finalize MPI automatically
from mpi4py import MPI        # import the 'MPI' module
import h5py


class DataGenerator:
    def __init__(self, filename, N, comm):
        self.comm = comm
        self.file = h5py.File(filename, 'w', driver="mpio", comm=comm)

        # Create datasets
        self.data_ds = self.file.create_dataset("indices", (N, 1), dtype='i')

    def __del__(self):
        self.file.close()
if __name__ == '__main__':
    MPI.Init()

    world = MPI.COMM_WORLD
    world_rank = MPI.COMM_WORLD.rank

    filename = "test.hdf5"
    N = 10

    data_gen = DataGenerator(filename, N, comm=world)

    MPI.Finalize()
which results in:
$ mpiexec -n 4 python test.py
*** The MPI_Barrier() function was called after MPI_FINALIZE was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[eu-login-04:01559] Local abort after MPI_FINALIZE started completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** The MPI_Barrier() function was called after MPI_FINALIZE was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[eu-login-04:01560] Local abort after MPI_FINALIZE started completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
*** The MPI_Barrier() function was called after MPI_FINALIZE was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[eu-login-04:01557] Local abort after MPI_FINALIZE started completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing the job to be terminated. The first process to do so was:

  Process name: [[15050,1],3]
  Exit code:    1
--------------------------------------------------------------------------
I am a bit confused as to what's going on here. If I move the MPI.Finalize() call to the end of the destructor, it works fine.
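Concretely, the working variant I mean is this (with the MPI.Finalize() call removed from the main block):

    def __del__(self):
        self.file.close()
        MPI.Finalize()  # finalize only after the file handle is closed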
Note that I also use h5py, which uses MPI for parallelization, so there is parallel file I/O going on here. Note also that h5py needs to be compiled with MPI support; you can easily do that by setting up a virtual environment and running pip install --no-binary=h5py h5py.
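A quick way to verify that the installed build actually has MPI support is h5py's get_config():

import h5py
print(h5py.get_config().mpi)  # True if h5py was built against parallel HDF5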
The way you wrote it, data_gen lives until the main function returns, but you call MPI.Finalize() within the function. Therefore the destructor runs after finalize. The h5py.File.close method seems to call MPI.Comm.Barrier internally, and calling that after finalize is forbidden by the MPI standard.
If you want to have explicit control, make sure all objects are destroyed before calling MPI.Finalize(). Of course, even that may not be enough if some objects are only destroyed by the garbage collector rather than by reference counting.
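As a sketch (assuming CPython, where reference counting runs __del__ as soon as the last reference is dropped), you could delete the object explicitly before finalizing:

if __name__ == '__main__':
    MPI.Init()
    world = MPI.COMM_WORLD

    data_gen = DataGenerator("test.hdf5", 10, comm=world)
    # ... work with data_gen ...

    del data_gen    # __del__ runs here and closes the file
    MPI.Finalize()  # safe: the file was closed while MPI was still initialized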
To avoid this, use context managers instead of destructors:
class DataGenerator:
    def __init__(self, filename, N, comm):
        self.comm = comm
        self.file = h5py.File(filename, 'w', driver="mpio", comm=comm)

        # Create datasets
        self.data_ds = self.file.create_dataset("indices", (N, 1), dtype='i')

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.file.close()


if __name__ == '__main__':
    MPI.Init()

    world = MPI.COMM_WORLD
    world_rank = MPI.COMM_WORLD.rank

    filename = "test.hdf5"
    N = 10

    with DataGenerator(filename, N, comm=world) as data_gen:
        pass

    MPI.Finalize()
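This way the file is guaranteed to be closed when the with block exits, strictly before MPI.Finalize() runs, no matter when (or whether) the garbage collector gets around to the object.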