I would like to compare Python's multithreading and multiprocessing. Here is an example of multithreading:
import numpy as np
import threading
import time

def calc_squares(numbers):
    for n in numbers:
        time.sleep(3)
        print(f'Square of {n} is {n**2}\n')

def calc_cubes(numbers):
    for n in numbers:
        time.sleep(3)
        print(f'cube of {n} is {n**3}\n')

arr = [2, 3, 4, 5]
start = time.time()
th1 = threading.Thread(target=calc_squares, args=(arr,))
th2 = threading.Thread(target=calc_cubes, args=(arr,))
th1.start()
th2.start()
th1.join()
th2.join()
# calc_squares(arr)
# calc_cubes(arr)
print(f'time taken is {time.time()-start}')
This takes approximately 12 seconds. Now let us consider multiprocessing:
import numpy as np
import threading
import time
import multiprocessing

def calc_squares(numbers):
    for n in numbers:
        time.sleep(3)
        print(f'Square of {n} is {n**2}\n')

def calc_cubes(numbers):
    for n in numbers:
        time.sleep(3)
        print(f'cube of {n} is {n**3}\n')

if __name__ == "__main__":
    arr = [2, 3, 4, 5]
    start = time.time()
    th1 = multiprocessing.Process(target=calc_squares, args=(arr,))
    th2 = multiprocessing.Process(target=calc_cubes, args=(arr,))
    th1.start()
    th2.start()
    th1.join()
    th2.join()
    # calc_squares(arr)
    # calc_cubes(arr)
    print(f'time taken is {time.time()-start}')
The result is the same:
Square of 2 is 4
cube of 2 is 8
Square of 3 is 9
cube of 3 is 27
Square of 4 is 16
cube of 4 is 64
Square of 5 is 25
cube of 5 is 125
time taken is 12.206198692321777
So my question: for a task like this, does it make any difference whether multithreading or multiprocessing is used?
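You have used time.sleep(3) in the functions in both cases, so you won't find any apparent difference. A sleeping thread is idle and releases the Global Interpreter Lock (GIL), so the other thread can run in the meantime. Either way the total time is dominated by the sleeps: 4 numbers x 3 seconds = about 12 seconds.
threading in Python (at least in standard CPython) does not give you true parallelism for Python code. Because of the GIL, only one thread executes Python bytecode at a time, so threads give you concurrency rather than parallelism. That works well for tasks that mostly wait (sleeping, network or disk I/O), but CPU-bound work is not spread across cores. This is where multiprocessing comes in.
multiprocessing starts separate processes, each with its own Python interpreter and its own GIL, so the operating system can schedule them on different CPU cores and CPU-bound work can really run in parallel.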
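To actually see a difference, the work has to be CPU-bound rather than sleeping. The following is only a minimal sketch, assuming standard CPython with the GIL and a machine with at least two cores. The names cpu_task and time_workers and the workload size are made up for illustration and are not from the question.

```python
import multiprocessing
import threading
import time


def cpu_task(n):
    # Pure-Python arithmetic loop: CPU-bound, so a thread running it holds the GIL
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_workers(worker_cls, n_workers, n):
    # Start n_workers workers of the given class (Thread or Process),
    # wait for all of them, and return the elapsed wall-clock seconds
    workers = [worker_cls(target=cpu_task, args=(n,)) for _ in range(n_workers)]
    start = time.perf_counter()
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 5_000_000  # hypothetical workload size; adjust for your machine
    print(f"threads:   {time_workers(threading.Thread, 2, n):.2f} s")
    print(f"processes: {time_workers(multiprocessing.Process, 2, n):.2f} s")
```

On a multi-core machine you would typically expect the two processes to finish noticeably faster than the two threads, since the threads take turns holding the GIL. The exact numbers depend on the hardware, the Python version, and the process start-up cost.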
A better approach for using multiprocessing:
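import multiprocessing as mp
import time

# Worker functions must be defined at module level so child processes can find them.
# They return their results instead of printing, so the parent can collect them.
def calc_squares(numbers):
    return [n**2 for n in numbers]

def calc_cubes(numbers):
    return [n**3 for n in numbers]

# Always guard the entry point with if __name__ == "__main__":
if __name__ == "__main__":
    arr = [2, 3, 4, 5]
    funcs = [calc_squares, calc_cubes]
    start_time = time.time()
    with mp.Pool() as pool:
        # Submit tasks and store AsyncResult objects
        async_results = [pool.apply_async(func, (arr,)) for func in funcs]
        # Retrieve results (blocks until done); must happen while the pool is still open
        results = [res.get() for res in async_results]
    print(results)  # Output: [[4, 9, 16, 25], [8, 27, 64, 125]]
    print(f'time taken is {time.time()-start_time}')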