I have recently been trying to write a multi-threaded echo chat server using epoll (for study purposes). I came up with two approaches.
Architecture 1. Create one epoll instance (one epollfd) and share it among the threads. Since every thread uses the same epoll instance, all the threads share all the sockets.
Architecture 2. Create the threads and give each thread its own epoll instance, with all of them sharing one listening socket. Each thread manages its own set of clients, which no other thread touches.
For a chat server, I am inclined to go with architecture 1, because I cannot predict which clients will be the most active, so distributing clients fairly across threads could be difficult in some cases.
I register the listening socket with EPOLLET (edge-triggered mode), so that when an event occurs only one thread wakes up and keeps calling accept() until it gets EAGAIN. The clients are handled the same way: each is registered with EPOLLET and EPOLLONESHOT, so whenever a client sends a message, no two threads will handle the same message, which avoids a race condition.
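A minimal sketch of the registration scheme described above, for Linux. The helper names (set_nonblocking, watch_listener, arm_client, accept_all) are made up for illustration and are not from my actual program; error handling is reduced to the bare minimum.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Put fd into non-blocking mode; returns 0 on success, -1 on error. */
static int set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1) return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Register the (non-blocking) listening socket in edge-triggered mode. */
static int watch_listener(int epfd, int listen_fd) {
    struct epoll_event ev = { .events = EPOLLIN | EPOLLET, .data.fd = listen_fd };
    return epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);
}

/* Arm a client with EPOLLET | EPOLLONESHOT.
 * op is EPOLL_CTL_ADD the first time and EPOLL_CTL_MOD to re-arm the
 * socket after a thread has finished handling one event. */
static int arm_client(int epfd, int client_fd, int op) {
    struct epoll_event ev = {
        .events = EPOLLIN | EPOLLET | EPOLLONESHOT,
        .data.fd = client_fd
    };
    return epoll_ctl(epfd, op, client_fd, &ev);
}

/* Drain the accept queue until EAGAIN, registering every new client.
 * Returns the number of clients registered, or -1 on an unexpected error. */
static int accept_all(int epfd, int listen_fd) {
    assert(epfd >= 0 && listen_fd >= 0);
    int accepted = 0;
    for (;;) {
        int client_fd = accept(listen_fd, NULL, NULL);
        if (client_fd == -1) {
            if (errno == EAGAIN || errno == EWOULDBLOCK) return accepted;
            if (errno == EINTR || errno == ECONNABORTED) continue;
            return -1;
        }
        if (set_nonblocking(client_fd) == -1 ||
            arm_client(epfd, client_fd, EPOLL_CTL_ADD) == -1) {
            close(client_fd);   /* could not register; drop this client */
            continue;
        }
        accepted++;
    }
}
```

Each worker thread would call epoll_wait() on the shared epfd, call accept_all() when the listening socket is reported, and call arm_client(epfd, fd, EPOLL_CTL_MOD) after it has finished reading from a client.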
It seems to me that I was able to build the echo chat server without any major flaws. I think edge-triggered mode can distribute accept() efficiently across multiple threads, because it wakes up only one thread at a time.
I have heard that EPOLLEXCLUSIVE exists because of the 'thundering herd' problem, but it seems to me that edge-triggered mode already balances the accept() load properly among multiple threads. And even if there is a thundering herd problem, it does not seem that detrimental.
Thus the question: What's the point of the EPOLLEXCLUSIVE flag if programmers can already balance accept() load properly using edge-triggered mode? Is there any pitfall I missed? Also, why would anyone need something like architecture 2 (or any architecture using multiple epoll instances)?
Sorry for the bad English in advance (I'm not native to English)
TL;DR: I want to know in which circumstances the EPOLLEXCLUSIVE flag is useful, and likewise when an architecture with multiple threads and multiple epoll instances is useful.
What's the point of EPOLLEXCLUSIVE flag if programmers already can balance accept() loads properly using edge triggered mode?
See below. Note, too, that epoll is very hard to use correctly, at least in multi-threading and multi-processing scenarios. It, like poll() and select(), is better suited for multiplexing work by a single thread.
Is there any pitfall I missed?
Yes. Epoll does not necessarily signal for each incoming connection. In edge-triggered mode, it signals upon the transition from zero pending connections to at least one. Therefore, it may occasionally be the case that two inbound connections arrive so close together that the thread awakened to handle the first does not accept() it before the second becomes pending as well. Then, if that thread follows the recommendation of the epoll docs and keeps accept()ing connections until no more are available, only that thread will accept any connections until the pending connection backlog falls to zero. That will take longer than it needs to, because only one thread is servicing all the connections. Other threads may idle despite there being work available.
One solution to this would be to use level-triggered instead of edge-triggered mode. That will leave threads sometimes waking only to discover that there is no work to do, but how much of a problem is that really? You could even use that as a signal that helps dynamically manage the number of threads running in your service. On the other hand, adding EPOLLEXCLUSIVE to this mix will solve the problem of unneeded wakeups, without incurring the problems described above.
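As a rough sketch of that last option on Linux: as I read epoll_ctl(2), EPOLLEXCLUSIVE governs wakeups across multiple epoll instances attached to the same file descriptor, and it is accepted only with EPOLL_CTL_ADD. The example therefore assumes each worker owns a private epoll instance and attaches the shared listening socket to it in level-triggered mode. The function name make_worker_epoll is hypothetical.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a private epoll instance for one worker and attach the shared
 * listening socket to it, level-triggered, with EPOLLEXCLUSIVE.
 * Returns the new epoll fd, or -1 on error.
 * Assumes a kernel and libc that define EPOLLEXCLUSIVE (Linux 4.5+). */
static int make_worker_epoll(int listen_fd) {
    assert(listen_fd >= 0);

    int epfd = epoll_create1(0);
    if (epfd == -1) return -1;

    struct epoll_event ev = {
        .events = EPOLLIN | EPOLLEXCLUSIVE,   /* no EPOLLET: level-triggered */
        .data.fd = listen_fd
    };
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev) == -1) {
        close(epfd);
        return -1;
    }
    return epfd;
}
```

Each worker would then call epoll_wait() on its own descriptor and accept() a connection when the listener is reported. Because the listener stays ready for as long as connections are pending, a worker need not drain the whole backlog itself.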
But if it is only the listening socket that is registered with epoll then an easier solution would be to not use epoll at all. If all your threads simply block on accept() (for which the listening socket must be in default, blocking mode) instead of on epoll_wait() then there will be no room for confusion. Each incoming connection will be served right away if there are any threads available, or as soon as one becomes available otherwise.
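A minimal sketch of that epoll-free variant using POSIX threads. The names echo_client, accept_worker and start_workers are illustrative only, and the assumption that each worker echoes to its client until the client disconnects is a simplification, not something a real chat server would necessarily do.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sys/socket.h>
#include <unistd.h>

/* Echo everything read from client_fd back to it until EOF or error. */
static void echo_client(int client_fd) {
    char buf[4096];
    ssize_t n;
    while ((n = read(client_fd, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) {
            /* MSG_NOSIGNAL: get EPIPE instead of SIGPIPE if the peer is gone */
            ssize_t w = send(client_fd, buf + off, (size_t)(n - off), MSG_NOSIGNAL);
            if (w <= 0) return;
            off += w;
        }
    }
}

/* Worker body: block in accept() on the shared, blocking listening socket.
 * The kernel hands each pending connection to one blocked thread. */
static void *accept_worker(void *arg) {
    int listen_fd = *(int *)arg;
    for (;;) {
        int client_fd = accept(listen_fd, NULL, NULL);
        if (client_fd == -1) {
            if (errno == EINTR || errno == ECONNABORTED) continue;
            return NULL;            /* invalid or closed listener: stop */
        }
        echo_client(client_fd);
        close(client_fd);
    }
}

/* Start up to count workers sharing *listen_fd; returns how many started.
 * listen_fd must stay valid for as long as the workers run. */
static int start_workers(int *listen_fd, pthread_t *threads, int count) {
    assert(count >= 0);
    int started = 0;
    for (int i = 0; i < count; i++) {
        if (pthread_create(&threads[started], NULL, accept_worker, listen_fd) != 0)
            break;
        started++;
    }
    return started;
}
```

With this layout a thread is tied up for as long as its client stays connected, so it fits the case described here, where only the listening socket would otherwise have been registered with epoll.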
Also, why would they need something like architecture 2? (or architecture using multiple epoll instances)
This architecture could make sense for multiple worker processes, for which architecture 1 is difficult to implement, at least if epoll is used at all. I'm not very keen on mixing epoll into matters of sharing resources among threads and / or processes in the first place.