I have a function call in Python that uses a package method which is not thread-safe (the package writes to three temporary files that always have the same names). Because the data passed to this method are large and I have many input sets, I am distributing the work with the mpi4py library so that each rank handles a different group of input data at any given time. My problem is that when I map calls to this function through MPI, multiple ranks sometimes enter the function at the same time, leading to a race condition in which two calls overwrite each other's data (and the script then errors out).
Since the package method is not thread-safe, my question is: how would I put a mutex-style lock on the function so that only one MPI rank is allowed to work inside it at a time?
For example:
    from mpi4py import MPI

    def mpi_call(args):
        comm = MPI.COMM_WORLD
        # Need to acquire a mutex-style lock here
        non_threadsafe_method(args)
        # Need to release the lock here
        return True
I have tried to use the Barrier() method here, but this leads to a deadlock because Barrier() is a collective call and only some of the ranks actually enter the method (not all ranks enter the function that calls the package method).
I would like to know the best way to handle a mutex-style lock for this type of function.
Thanks!
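Try a filesystem lock. This relies on your conflict being between processes (each MPI rank is a separate process) rather than between threads inside one process, which an inter-process file lock is not designed to handle. All competing ranks must also see the same lock file, so if they run on several nodes the lock path has to be on a filesystem they share. Using the fasteners library, your code would look like this: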
    import fasteners
    from mpi4py import MPI

    def mpi_call(args):
        comm = MPI.COMM_WORLD
        # Blocks until no other process holds the lock on this file
        with fasteners.InterProcessLock('/tmp/tmp_lock_file'):
            non_threadsafe_method(args)
        # The lock is released automatically when the with block exits
        return True
See more here: https://fasteners.readthedocs.io/en/latest/examples.html#interprocess-locks
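If adding a dependency is not an option, the same file-lock idea can be sketched with the standard library's fcntl.flock on Unix-like systems. This is only a minimal sketch, not what fasteners does internally. The helper name file_lock, the LOCK_PATH location and the stand-in body of non_threadsafe_method are all assumptions made for illustration.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

# Hypothetical lock file location; every competing rank must use the same path.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "non_threadsafe_method.lock")


@contextmanager
def file_lock(path):
    """Hold an exclusive advisory lock on `path` while the with block runs."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Blocks until no other open file description holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield
    finally:
        # Closing the descriptor releases the lock.
        os.close(fd)


def non_threadsafe_method(args):
    # Stand-in for the real package method (assumption for this sketch).
    return sum(args)


def mpi_call(args):
    # Only one process at a time gets past this line.
    with file_lock(LOCK_PATH):
        non_threadsafe_method(args)
    return True
```

Each rank would call mpi_call(...) exactly as before. Ranks that do not call it are unaffected, which is why this avoids the Barrier() deadlock. Note that flock is advisory and Unix-only, and whether it works across nodes depends on the filesystem holding the lock file.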