Tags: python, multithreading, multiprocessing

Race condition issue in Python


I want to consider the following simple code:

from multiprocessing import Process, Value
import time

def add_10_times(number):
    for i in range(10):
        time.sleep(0.01)
        number.value += 1

if __name__ == "__main__":
    shared_number = Value('i', 0)
    process1 = Process(target=add_10_times, args=(shared_number,))
    process2 = Process(target=add_10_times, args=(shared_number,))
    process1.start()
    process2.start()
    process1.join()
    process2.join()
    print(f'final value is {shared_number.value}')

Based on the following site: multiprocessing

and based on the well-known problem called a race condition, the value should be less than 20, right? But in most cases it gives me the value 20:

final value is 20

Sometimes it drops to 16 or 18, but in most cases it gives me 20. So where is the race condition, and under what circumstances does it occur?


Solution

  • So: you have already met the race condition, even with a count as low as 20.

    Keep in mind that race conditions are not deterministic (although one can craft special scenarios in test cases so that they are triggered every time, unless the mitigating code under test is working).

    The problem is that code like this is expected to be deterministic: even a count to several million, instead of "20", would still need to be exact.

    When using multiprocessing, in your code the "vulnerability window" is just the span between reading the value into a local object, performing the addition, and writing it back - that takes microseconds. Compared to the overhead of creating new processes and the time.sleep call you use, I am actually surprised you hit it at all in a few attempts counting only up to "20".

    Just in case it is not clear: the .value += 1 operation is actually three operations. The value is read from whatever mechanism Python's multiprocessing uses to keep track of the real object and converted to an int object inside the subprocess; the addition is performed; and the new value is written back through the .value mechanism. So, if another process reads the value in the meantime, it will read it before the addition.
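
    Spelled out, the augmented assignment is equivalent to something like the following sketch of the read-modify-write sequence as seen from one process (not the actual CPython internals):

    tmp = number.value    # 1. read the shared value into a local int
    tmp = tmp + 1         # 2. increment the local copy
    number.value = tmp    # 3. write the result back to the shared memory
    # If the other process performs its own read and write between steps 1 and 3,
    # one of the two increments is silently overwritten and lost.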

    Would it be possible to create a multiprocessing.Value that implements += as an atomic operation? Yes, it would, but that would require a lot more code, and if it were done for +=, it would have to be done for all the augmented assignment operators. And then, once people got used to those being "safe", they might add just one more step to the code and have the safety silently go away. So I guess the authors opted for simplicity, trusting that people writing concurrent programs would educate themselves on how this sort of thing works (which you are doing by asking this question).

    Change your count to 100_000 instead of 20, and you will likely see it every time. Include a sleep inside the dangerous window, between reading the value and writing it back, and you will see it every single time.

    And finally, use a lock encompassing the read, the increment and the write, and you will see it go away and everything work smoothly:

    The script below is tuned so that you will see the race condition happening on almost every single addition - only about 0.5% of the additions performed by the two processes make it into the final value:

    from multiprocessing import Process, Value
    from random import random
    import time

    def add_10_times(number):
        for i in range(5_000):
            # Unsafe read-modify-write, with a sleep widening the window
            # between the read and the write:
            value = number.value
            time.sleep(random() / 300)
            value += 1
            number.value = value

    if __name__ == "__main__":
        shared_number = Value('i', 0)
        process1 = Process(target=add_10_times, args=(shared_number,))
        process2 = Process(target=add_10_times, args=(shared_number,))
        process1.start()
        process2.start()
        process1.join()
        process2.join()
        print(f'final value is {shared_number.value}')

    A correct script, with a multiprocessing.Lock encompassing the critical region, so that every run yields the correct number of additions by both processes:

    from multiprocessing import Process, Value, Lock
    from random import random
    import time

    def add_10_times(number, lock):
        for i in range(5_000):
            # The lock makes the whole read-increment-write sequence atomic
            # with respect to the other process:
            with lock:
                value = number.value
                time.sleep(random() / 300)
                value += 1
                number.value = value

    if __name__ == "__main__":
        shared_number = Value('i', 0)
        lock = Lock()
        process1 = Process(target=add_10_times, args=(shared_number, lock))
        process2 = Process(target=add_10_times, args=(shared_number, lock))
        process1.start()
        process2.start()
        process1.join()
        process2.join()
        print(f'final value is {shared_number.value}')
    

    Note that the use of locks is so much expected when writing this kind of code that the Value objects even have a lock of their own, built in by default - it can be used in place of the standalone multiprocessing.Lock that this example passes explicitly.
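
    For instance, a minimal sketch relying on that built-in lock (exposed as .get_lock() on the synchronized Value wrapper) could look like this:

    from multiprocessing import Process, Value

    def add_10_times(number):
        for i in range(5_000):
            # The synchronized Value wrapper already carries a lock;
            # holding it makes the read-increment-write sequence atomic.
            with number.get_lock():
                number.value += 1

    if __name__ == "__main__":
        shared_number = Value('i', 0)   # lock-protected by default
        process1 = Process(target=add_10_times, args=(shared_number,))
        process2 = Process(target=add_10_times, args=(shared_number,))
        process1.start()
        process2.start()
        process1.join()
        process2.join()
        print(f'final value is {shared_number.value}')  # reliably 10_000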