I am trying to share a lock among processes. I understand that the way to share a lock is to pass it as an argument to the target function. However, I found that even the approach below works. I don't understand how the processes are sharing this lock. Could anyone please explain?
import multiprocessing as mp
import time


class SampleClass:

    def __init__(self):
        self.lock = mp.Lock()
        self.jobs = []
        self.total_jobs = 10

    def test_run(self):
        for i in range(self.total_jobs):
            p = mp.Process(target=self.run_job, args=(i,))
            p.start()
            self.jobs.append(p)

        for p in self.jobs:
            p.join()

    def run_job(self, i):
        with self.lock:
            print('Sleeping in process {}'.format(i))
            time.sleep(5)


if __name__ == '__main__':
    t = SampleClass()
    t.test_run()
On Windows (which you said you're using), these kinds of things always reduce to details about how multiprocessing plays with pickle, because all Python data crossing process boundaries on Windows is implemented by pickling on the sending end (and unpickling on the receiving end).
My best advice is to avoid doing things that raise such questions to begin with ;-) For example, the code you showed blows up on Windows under Python 2, and also blows up under Python 3 if you use a multiprocessing.Pool method instead of multiprocessing.Process.
It's not just the lock: simply trying to pickle a bound method (like self.run_job) blows up in Python 2. Think about it. You're crossing a process boundary, and there isn't an object corresponding to self on the receiving end. To what object is self.run_job supposed to be bound on the receiving end?
In Python 3, pickling self.run_job also pickles a copy of the self object. So that's the answer: a SampleClass object corresponding to self is created by magic on the receiving end. Clear as mud. t's entire state is pickled, including t.lock. That's why it "works".
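To see the "copy of self" behavior without starting any processes, here is a minimal sketch, assuming Python 3. It uses collections.Counter purely as a stand-in for SampleClass (it is not part of the original code); any picklable object with a Python-level method should behave the same way.

```python
import pickle
from collections import Counter

# Stand-in for the SampleClass instance (an assumption for illustration).
original = Counter(jobs=10)

# A bound method, analogous to self.run_job.
bound_method = original.update

# Round-trip the bound method through pickle, as happens when the
# target crosses a process boundary on Windows.
restored_method = pickle.loads(pickle.dumps(bound_method))

# The restored method is bound to a *copy* of the original object.
restored_self = restored_method.__self__
print(restored_self is original)   # False: a distinct object
print(restored_self == original)   # True: same state was pickled along

# Calling the restored method changes only the copy.
restored_method({"jobs": 1})
print(original["jobs"], restored_self["jobs"])
```

The last line shows that the two objects no longer share state after unpickling, which is also why ordinary attributes changed in a child process never show up in the parent.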
See this for more implementation details:
Why can I pass an instance method to multiprocessing.Process, but not a multiprocessing.Pool?
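One piece of the Process-versus-Pool difference can be checked directly. In Python 3, a multiprocessing.Lock refuses to be pickled unless a child process is being spawned at that moment. On Windows, Process pickles its target and arguments at exactly that point, while Pool hands tasks to workers that are already running. A minimal sketch, assuming Python 3 (the variable names are made up for illustration):

```python
import multiprocessing as mp
import pickle

lock = mp.Lock()

# Pickling a lock outside of process spawning is rejected.
try:
    pickle.dumps(lock)
except RuntimeError as exc:
    error_message = str(exc)
else:
    error_message = None

print(error_message)
```

So any object that drags a Lock along in its state, such as the SampleClass instance in the question, cannot be sent as part of a Pool task.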
In the long run, you'll suffer the fewest mysteries if you stick to things that were obviously intended to work: pass module-global callable objects (neither, e.g., instance methods nor local functions), and explicitly pass multiprocessing data objects (whether an instance of Lock, Queue, manager.list, etc etc).
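As a rough sketch of that advice applied to the code in the question: the worker becomes a module-level function, and the Lock is passed explicitly in args. The results queue, the main helper, and the shortened sleep are additions for illustration only, not part of the original code.

```python
import multiprocessing as mp
import time


def run_job(lock, i, results):
    # Module-level function: picklable by name, no self to carry along.
    with lock:
        print('Sleeping in process {}'.format(i))
        time.sleep(0.1)
        results.put(i)


def main(total_jobs=4):
    lock = mp.Lock()       # shared explicitly with every child
    results = mp.Queue()   # hypothetical: lets the parent see which jobs ran
    jobs = [mp.Process(target=run_job, args=(lock, i, results))
            for i in range(total_jobs)]
    for p in jobs:
        p.start()
    # Drain the queue before joining so no child blocks on a full pipe.
    finished = sorted(results.get() for _ in jobs)
    for p in jobs:
        p.join()
    return finished


if __name__ == '__main__':
    print(main())
```

Written this way, nothing depends on how a bound method or its instance happens to be pickled, so the same code should behave the same on Windows and elsewhere.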