Within an infinite loop, I am listening on 100+ file descriptors using select. When an fd has packets ready to be read, I notify the packet processor thread assigned to that file descriptor, and I don't set its bit for the next round until the processor thread notifies me that it is done. I am wondering how inefficient my code would be if I didn't recalculate the max fd for select every time I clear or set a file descriptor in the set. I expect the file descriptors to be nearly contiguous and the data arrival rate to be a few thousand bytes per second on each fd.
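
For reference, here is a minimal sketch of the loop I am describing; `fds[]`, `nfds`, and the worker notification are placeholders for my real code:

```c
#include <sys/select.h>

void event_loop(int fds[], int nfds)
{
    fd_set active;                         /* fds currently eligible for select() */
    FD_ZERO(&active);
    int maxfd = -1;
    for (int i = 0; i < nfds; i++) {
        FD_SET(fds[i], &active);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    for (;;) {
        fd_set readable = active;          /* select() modifies the set passed to it */
        if (select(maxfd + 1, &readable, NULL, NULL, NULL) < 0)
            continue;                      /* real code would handle errors */

        for (int i = 0; i < nfds; i++) {
            if (FD_ISSET(fds[i], &readable)) {
                FD_CLR(fds[i], &active);   /* ignore this fd until its worker acks */
                /* notify_packet_processor(fds[i]);  -- hypothetical */
            }
        }
        /* On a worker ack, FD_SET(fd, &active) again; the question is whether
         * maxfd must be recomputed here or can stay at its high-water mark. */
    }
}
```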
You should really use `poll` instead of `select`. Both are standard, but `poll` is easier to use, does not place a limit on the number of file descriptors you can check (whereas `select` limits you to the compile-time constant `FD_SETSIZE`), and is more efficient. If you do use `select`, you can always pass `FD_SETSIZE` as the first argument, but this will of course give worst-case performance, since the kernel has to scan the whole `fd_set`; passing the actual max+1 allows a shorter scan, but it is still not as efficient as the array passed to `poll`.
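
As a rough illustration, here is a sketch of your loop rewritten around `poll`; the `fds[]` array and the worker hand-off are hypothetical stand-ins for your code:

```c
#include <poll.h>

void poll_loop(int fds[], int nfds)
{
    struct pollfd pfds[nfds];
    for (int i = 0; i < nfds; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
    }

    for (;;) {
        if (poll(pfds, nfds, -1) < 0)
            continue;                          /* real code would handle errors */

        for (int i = 0; i < nfds; i++) {
            if (pfds[i].revents & POLLIN) {
                pfds[i].fd = -pfds[i].fd - 1;  /* negative fd: poll() skips this entry */
                /* hand the fd to its packet processor thread here */
            }
        }
        /* When a worker signals completion, restore the entry with
         * pfds[i].fd = -pfds[i].fd - 1; there is no max-fd bookkeeping at all. */
    }
}
```

Note that ignoring an entry is done by negating its `fd` field, which `poll` is documented to skip, so pausing and resuming a descriptor never requires rebuilding the array.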
For what it's worth, these days it seems stylish to use the nonstandard Linux `epoll` or whatever the BSD equivalent is. These interfaces may have some advantages if you have a huge number (on the order of tens of thousands) of long-lived (at least several round trips) connections, but otherwise performance will not be noticeably better (and, at the lower end, may be worse). They are also non-portable and, in my opinion, harder to use correctly than the plain, portable `poll`.
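
For completeness, a rough sketch of what the same pattern looks like with the Linux-only `epoll` interface; error handling is omitted and `fds[]`/`nfds` are again hypothetical inputs:

```c
#include <sys/epoll.h>

void epoll_loop(int fds[], int nfds)
{
    int epfd = epoll_create1(0);
    for (int i = 0; i < nfds; i++) {
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = fds[i] };
        epoll_ctl(epfd, EPOLL_CTL_ADD, fds[i], &ev);
    }

    struct epoll_event events[64];
    for (;;) {
        int n = epoll_wait(epfd, events, 64, -1);
        for (int i = 0; i < n; i++) {
            int fd = events[i].data.fd;
            epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);  /* stop watching until the worker acks */
            /* hand fd to its packet processor thread here */
        }
        /* re-add the fd with EPOLL_CTL_ADD once its worker reports completion */
    }
}
```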