python, distributed-computing, concurrent-processing

Parallelization of Python code on different machines on different networks


I’m looking to parallelize a batch of tasks across two computers on different networks, but I’m not sure how to do so in Python.

Suppose I have two computers, Computer A and Computer B on two different networks, and I have a batch of 100 tasks to be accomplished. Naively, I could assign Computer A and Computer B to each do 50 tasks, but if Computer A finishes its tasks before Computer B, I would like Computer A to take on some of Computer B’s remaining tasks. Both computers should return the results of their tasks to my local machine. How can this be done?


Solution

  • Luckily, Python has an excellent library, Celery, that lets you achieve exactly what you want. It is well documented and has a large, active community of users and contributors. You just need to set up a broker (a message queue) and configure Celery; a minimal sketch follows below the links.

    Celery also offers many features you can use as needed: monitoring, job scheduling, and the Celery canvas, to name a few.

    https://docs.celeryproject.org/en/stable/getting-started/introduction.html
    https://medium.com/swlh/python-developers-celery-is-a-must-learn-technology-heres-how-to-get-started-578f5d63fab3
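
For concreteness, here is a minimal sketch of how that could look. It assumes a Redis broker and result backend reachable from both computers and from your local machine; the host name, module names, and task body are hypothetical placeholders, not part of the question. The key idea is that both workers pull from the same queue, so a machine that finishes early simply keeps taking tasks until the queue is empty.

```python
# tasks.py -- shared by the dispatching machine and both workers
# (minimal sketch: the broker/backend URLs and the task body are placeholders)
from celery import Celery

app = Celery(
    "tasks",
    broker="redis://your-broker-host:6379/0",   # must be reachable from both networks
    backend="redis://your-broker-host:6379/1",  # result backend so results flow back to you
)

@app.task
def run_task(task_id):
    # replace with the real work for one task
    return f"task {task_id} done"
```

```python
# dispatch.py -- run on your local machine to enqueue the batch and collect results
from celery import group

from tasks import run_task

if __name__ == "__main__":
    # Enqueue all 100 tasks at once; idle workers keep pulling the next task,
    # so Computer A automatically picks up Computer B's leftovers if it finishes first.
    job = group(run_task.s(i) for i in range(100))
    result = job.apply_async()
    print(result.get(timeout=600))  # list of 100 return values, in task order
```

Each computer then runs a worker against the same module, e.g. `celery -A tasks worker --loglevel=info --concurrency=4`. Because both workers consume from one shared queue, the load balancing you describe falls out for free.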