Tags: python, multithreading, python-multiprocessing, python-multithreading

Comparative analysis of multithreading and multiprocessing


I would like to compare Python's multithreading and multiprocessing with each other. Here is an example of multithreading:

    import threading
    import time

    def calc_squares(numbers):
        for n in numbers:
            time.sleep(3)
            print(f'Square of {n} is {n**2}\n')

    def calc_cubes(numbers):
        for n in numbers:
            time.sleep(3)
            print(f'cube of {n} is {n**3}\n')

    arr = [2, 3, 4, 5]
    start = time.time()
    th1 = threading.Thread(target=calc_squares, args=(arr,))
    th2 = threading.Thread(target=calc_cubes, args=(arr,))
    th1.start()
    th2.start()
    th1.join()
    th2.join()

    # calc_squares(arr)
    # calc_cubes(arr)
    print(f'time taken is {time.time()-start}')

This takes approximately 12 seconds. Now let us consider multiprocessing:

    import multiprocessing
    import time

    def calc_squares(numbers):
        for n in numbers:
            time.sleep(3)
            print(f'Square of {n} is {n**2}\n')

    def calc_cubes(numbers):
        for n in numbers:
            time.sleep(3)
            print(f'cube of {n} is {n**3}\n')

    if __name__ == "__main__":
        arr = [2, 3, 4, 5]
        start = time.time()
        th1 = multiprocessing.Process(target=calc_squares, args=(arr,))
        th2 = multiprocessing.Process(target=calc_cubes, args=(arr,))
        th1.start()
        th2.start()
        th1.join()
        th2.join()

        # calc_squares(arr)
        # calc_cubes(arr)
        print(f'time taken is {time.time()-start}')

The result is the same:

    Square of 2 is 4

    cube of 2 is 8

    Square of 3 is 9

    cube of 3 is 27

    Square of 4 is 16

    cube of 4 is 64

    Square of 5 is 25

    cube of 5 is 125

    time taken is 12.206198692321777

So my question: for such a task, does it make any difference whether multithreading or multiprocessing is used?


Solution

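  • Both versions spend all of their time in time.sleep(3), so you won't see any difference between them. Each function sleeps 4 × 3 s = 12 s. Threads and processes both let the two waits overlap, so the total is about 12 s either way, instead of the roughly 24 s a sequential run would take.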

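    Threading in Python is not true parallelism. In standard CPython builds, the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time, so threads give you concurrency: they take turns on the interpreter rather than running simultaneously on different cores. Blocking calls such as time.sleep and most I/O release the GIL, which is why threads work well for waiting but do not speed up CPU-bound work.

    multiprocessing starts separate processes, each with its own interpreter and its own GIL, so CPU-bound work can really run in parallel on different CPU cores. The price is process startup time and the cost of pickling arguments and results to pass them between processes.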
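    The point about waiting can be checked with a small sketch. The helper names (wait_task, run_sequential, run_threaded) and the 0.5 s duration are made up for illustration and are not from the question. Because time.sleep releases the GIL, the threaded run should take about as long as one wait rather than the sum of both.

```python
import threading
import time


def wait_task(seconds):
    # time.sleep releases the GIL, so other threads can run while this one waits.
    time.sleep(seconds)


def run_sequential(n_tasks, seconds):
    """Run the waits one after another and return the elapsed wall time."""
    start = time.perf_counter()
    for _ in range(n_tasks):
        wait_task(seconds)
    return time.perf_counter() - start


def run_threaded(n_tasks, seconds):
    """Run each wait in its own thread and return the elapsed wall time."""
    start = time.perf_counter()
    threads = [
        threading.Thread(target=wait_task, args=(seconds,))
        for _ in range(n_tasks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"sequential: {run_sequential(2, 0.5):.2f} s")
    print(f"threaded:   {run_threaded(2, 0.5):.2f} s")
```

    Swapping threading.Thread for multiprocessing.Process here would give roughly the same timing plus process startup overhead, which matches what the question observed.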

    A better approach for using multiprocessing:

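    import multiprocessing as mp
    import time

    # Return-value versions of the question's functions (assumed here, so that
    # `results` holds the lists shown below). They must live at module level
    # so that worker processes can find them.
    def calc_squares(numbers):
        return [n**2 for n in numbers]

    def calc_cubes(numbers):
        return [n**3 for n in numbers]

    # Always use this guard: on platforms that spawn processes,
    # each child re-imports this module.
    if __name__ == "__main__":
        arr = [2, 3, 4, 5]
        funcs = [calc_squares, calc_cubes]
        start_time = time.time()

        with mp.Pool() as pool:
            # Submit tasks and store AsyncResult objects
            async_results = [pool.apply_async(func, (arr,)) for func in funcs]
            # Retrieve results (blocks until done)
            results = [res.get() for res in async_results]

        print(results)  # Output: [[4, 9, 16, 25], [8, 27, 64, 125]]
        print(f'time taken is {time.time()-start_time}')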
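To actually see a difference between the two approaches, the workers need to do CPU-bound work instead of sleeping. The following is a minimal sketch, not a rigorous benchmark: sum_of_squares and timed_map are hypothetical helpers, the input size is arbitrary, and concurrent.futures is used here only because it gives threads and processes the same interface. On a standard (GIL-enabled) CPython with several cores, the process pool would be expected to finish faster, but the exact numbers depend on the machine and the Python build.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def sum_of_squares(n):
    """CPU-bound work: a pure-Python loop that holds the GIL while it runs."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed_map(executor_cls, func, inputs):
    """Run func over inputs with the given executor; return (results, seconds)."""
    start = time.perf_counter()
    with executor_cls(max_workers=len(inputs)) as executor:
        results = list(executor.map(func, inputs))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    inputs = [2_000_000] * 4  # arbitrary workload size, chosen for illustration

    thread_results, thread_secs = timed_map(
        ThreadPoolExecutor, sum_of_squares, inputs
    )
    process_results, process_secs = timed_map(
        ProcessPoolExecutor, sum_of_squares, inputs
    )

    # Both approaches compute the same values; only the wall time differs.
    assert thread_results == process_results
    print(f"threads:   {thread_secs:.2f} s")
    print(f"processes: {process_secs:.2f} s")
```

For very small workloads the process pool can be slower, because starting processes and pickling arguments costs more than the work itself.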