The only thing my Celery task does is make an API request and send the response back to a Redis
queue. What I'd like to achieve is to utilize as many resources as possible by executing tasks in a coroutine-like fashion: every time a coroutine hits requests.post(),
the context switcher can switch and allocate resources to another coroutine to send one more request, and so on.
As I understand it, to achieve this my worker has to run with a gevent
execution pool:
celery worker --app=worker.app --pool=gevent --concurrency=500
But it doesn't solve the problem on its own. I have found that (probably) for it to work as expected we need monkey patching:
@app.task
def task_make_request(payload):
    import gevent.monkey
    gevent.monkey.patch_all()
    import requests
    requests.post('url', payload)
The questions:

1. Is gevent the only execution pool that can be used for this goal?
2. Does patch_all make requests.post() asynchronous, so that the context switcher can allocate resources to other coroutines?

When you run under the gevent
worker, monkey patching happens almost immediately at worker startup (see: celery.__init__), and does not need to be repeated inside the task. This patches the socket, threading, and related concurrency modules. You can confirm it by inspecting the requests
library dynamically at runtime (an exercise left to the reader).
You can also use the eventlet worker; the Celery repository includes a webscraping example built on it: here
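For comparison, swapping pools is just a flag change on the worker invocation; a sketch reusing the app module from the question:

```shell
celery worker --app=worker.app --pool=eventlet --concurrency=500
```

eventlet, like gevent, relies on monkey patching to make blocking I/O cooperative, so the task code itself does not change between the two pools.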