python multithreading pool gevent greenlets

Why Use gevent.joinall() Instead of pool.imap_unordered() to Run Greenlets?


The title says it all. It seems better and faster to run greenlets in parallel (sort of) through one of the gevent.Pool methods, such as pool.imap_unordered(), than through gevent.joinall(). What are the pros and cons of each approach?


Solution

  • I think the key difference is not raw performance but resource management. When you use gevent.joinall(), you have to manage how many greenlets exist at once yourself. A naive implementation spawns one greenlet per unit of work, so the number of live greenlets grows with the size of the request.

    On the other hand, gevent.Pool can easily be configured to cap how many greenlets run at once, which protects your application from running out of resources.

    As usual, it's a tradeoff. A pool may run slower because it potentially won't allow as many greenlets to run at once as a naive gevent.joinall() implementation would. However, you are less likely to run your application out of resources (and cascade into other errors).

    Ultimately you have to answer questions like these: Are requests likely to be too large? Do you have plenty of resources to draw from? Is raw peak performance more important than average reliability?
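To make the difference concrete, here is a minimal sketch of both approaches. The `make_task` helper, the `stats` dictionary, and the `limit` parameter are hypothetical names made up for illustration, and `gevent.sleep()` stands in for whatever I/O the real task would do. The sketch records the peak number of tasks in flight so the effect of the cap can be observed.

```python
import gevent
from gevent.pool import Pool


def make_task(stats):
    """Build a hypothetical I/O-bound task that records peak concurrency."""
    def task(item):
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        gevent.sleep(0.05)  # stand-in for a network call or other I/O wait
        stats["active"] -= 1
        return item * 2
    return task


def run_with_joinall(items):
    """Spawn one greenlet per item; nothing limits how many exist at once."""
    stats = {"active": 0, "peak": 0}
    task = make_task(stats)
    greenlets = [gevent.spawn(task, item) for item in items]
    gevent.joinall(greenlets)
    # Results come back in the order the greenlets were spawned.
    return [g.value for g in greenlets], stats["peak"]


def run_with_pool(items, limit=5):
    """Run the same tasks through a pool capped at `limit` greenlets."""
    stats = {"active": 0, "peak": 0}
    task = make_task(stats)
    pool = Pool(limit)
    # imap_unordered yields each result as soon as it is ready,
    # so the order of results is not guaranteed.
    results = list(pool.imap_unordered(task, items))
    return results, stats["peak"]


if __name__ == "__main__":
    print(run_with_joinall(range(20)))
    print(run_with_pool(range(20), limit=5))
```

With 20 items, the joinall version has every task in flight together, while the pool version never runs more than `limit` tasks at once and so takes more passes to finish. If you want joinall-style code with a cap, calling `pool.spawn()` followed by `pool.join()` is another option, since `spawn()` blocks while the pool is full.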