Many Linux/Unix programming books and tutorials describe the "Thundering Herd Problem", which happens when multiple threads or forked processes are blocked in a select() call, waiting for a listening socket to become readable. When a connection comes in, all of them are woken up, but only one "wins" with a successful call to accept(). In the meantime, a lot of CPU time is wasted waking up all the other threads/processes for no reason.
I noticed a project which provides a "fix" for this problem in the Linux kernel, but it is a very old patch.
I think there are two variants: one where each fork does select() and then accept(), and one that just does accept().
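To make the two variants concrete, here is a minimal sketch in Python. It uses threads in one process instead of forks so that it is easy to run, and the names make_listener, Stats, select_then_accept and accept_only are made up for this illustration. In the first variant the wasted wakeups can be counted from user space; in the second, any extra wakeups would happen inside the kernel and are not visible to this code.

```python
import select
import socket
import threading


def make_listener(blocking):
    """Create a TCP listener on an ephemeral localhost port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", 0))
    sock.listen(64)
    sock.setblocking(blocking)
    return sock


class Stats:
    """Thread-safe counters shared by all workers (illustrative helper)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.wakeups = 0
        self.accepted = 0
        self.lost_races = 0

    def bump(self, name):
        with self.lock:
            setattr(self, name, getattr(self, name) + 1)


def select_then_accept(listener, stop, stats):
    """Variant 1: wait in select(), then try accept().

    The listener must be non-blocking, otherwise a worker that loses
    the race would block inside accept().
    """
    while not stop.is_set():
        readable, _, _ = select.select([listener], [], [], 0.05)
        if not readable:
            continue  # timeout, re-check the stop flag
        stats.bump("wakeups")
        try:
            conn, _ = listener.accept()
        except BlockingIOError:
            # Another worker took the connection first: a wasted wakeup.
            stats.bump("lost_races")
            continue
        conn.close()
        stats.bump("accepted")


def accept_only(listener, stats):
    """Variant 2: every worker blocks directly in accept()."""
    while True:
        conn, _ = listener.accept()
        conn.close()
        stats.bump("accepted")
```

Running several select_then_accept workers against one listener and comparing wakeups with accepted gives a rough picture of how many wakeups were wasted on a given system.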
Do modern Unix/Linux kernels still have the Thundering Herd Problem in both of these cases, or only in the "select() then accept()" version?
It's there and it's real. See this issue that we are seeing in uwsgi: https://github.com/unbit/uwsgi/issues/2611
If I disable the --thunder-lock option in uwsgi, uwsgi no longer uses the system's proper API/locking mechanism. In that case, during peak load, I see a lot of context switches and a lot of wasted time, and my application's response time stays consistently high. (I am talking about 1 lakh, i.e. 100,000, requests per minute on my server at the moment.)
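As a rough illustration of why a lock helps, here is a minimal Python sketch of the general idea of serializing accept() behind a lock. It is not uwsgi's implementation. It uses threads and threading.Lock so it can run in a single process; forked workers would need a lock shared across processes instead (for example multiprocessing.Lock). The names serialized_accept_worker, accept_lock and counts are made up for this example.

```python
import select
import socket
import threading


def serialized_accept_worker(listener, accept_lock, stop, counts):
    """Only the worker holding accept_lock waits on the listening socket.

    The other workers sleep on the lock rather than on the socket, so an
    incoming connection wakes a single worker.
    """
    while not stop.is_set():
        with accept_lock:
            readable, _, _ = select.select([listener], [], [], 0.05)
            if not readable:
                continue  # timeout: release the lock and re-check stop
            try:
                conn, _ = listener.accept()
            except BlockingIOError:
                # Not expected while the lock is held; counted for checking.
                counts["lost_races"] += 1
                continue
            # Counters are only updated while holding accept_lock.
            counts["accepted"] += 1
        # Handle the connection outside the lock so another worker
        # can start waiting for the next one.
        conn.close()
```

The trade-off is that every accepted connection now costs a lock acquisition, which is why this kind of serialization is usually an option and not always on.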