c multithreading pipe posix-select

How important is setting the max fd in select?


Within an infinite loop, I am listening on 100+ file descriptors using select. If an fd has packets ready to be read, I notify the packet processor thread assigned to that file descriptor, and I don't set the bit for that descriptor again until the processor thread notifies me that it is done. How inefficient would my code be if I didn't recalculate the max fd for select every time I clear or set a file descriptor in the set? I expect the file descriptors to be nearly contiguous and the data arrival rate to be a few thousand bytes per second on each fd.


Solution

  • You should really use poll instead of select. Both are standard, but poll is easier to use, places no limit on the number of file descriptors you can check (whereas select can only handle descriptors numbered below the compile-time constant FD_SETSIZE), and is more efficient. If you do use select, you can always pass FD_SETSIZE as the first argument, but this gives worst-case performance, since the kernel has to scan the whole fd_set. Passing the actual max+1 allows a shorter scan, but it is still not as efficient as the array passed to poll.

    For what it's worth, these days it seems stylish to use the nonstandard Linux epoll or its BSD counterpart, kqueue. These interfaces may have some advantages if you have a huge number (on the order of tens of thousands) of long-lived connections (at least several round trips each), but otherwise performance will not be noticeably better (and, at the lower end, may be worse). They are also non-portable and, in my opinion, harder to use correctly than the plain, portable poll.
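    To make the poll suggestion concrete, here is a minimal sketch of the loop described in the question. It assumes each worker gets its descriptor through some mechanism of your own; the helper names watch_pause, watch_resume, and collect_ready are hypothetical and not part of any library. It relies on the POSIX rule that poll ignores array entries whose fd is negative, so a descriptor can be paused while its worker is busy without rebuilding the array or tracking a max fd.

```c
#include <poll.h>

/* Hypothetical helper: stop watching an entry while its worker is busy.
   poll() skips entries with a negative fd; ~fd is negative for every
   valid descriptor (including 0) and can be undone by applying ~ again. */
static void watch_pause(struct pollfd *p)
{
    if (p->fd >= 0)
        p->fd = ~p->fd;
}

/* Hypothetical helper: start watching the entry again once the worker
   has reported that it is done. */
static void watch_resume(struct pollfd *p)
{
    if (p->fd < 0)
        p->fd = ~p->fd;
    p->revents = 0;
}

/* One round of the loop: wait up to timeout_ms, store every descriptor
   that is readable (or hung up / in error) in ready_out, and pause it.
   ready_out must have room for n entries.
   Returns the number of descriptors stored, 0 on timeout, or -1 if
   poll() failed (check errno; EINTR usually just means "retry"). */
static int collect_ready(struct pollfd *fds, nfds_t n, int timeout_ms,
                         int *ready_out)
{
    int rc = poll(fds, n, timeout_ms);
    if (rc <= 0)
        return rc;

    int count = 0;
    for (nfds_t i = 0; i < n; i++) {
        if (fds[i].revents & (POLLIN | POLLHUP | POLLERR)) {
            ready_out[count++] = fds[i].fd;  /* save before pausing */
            watch_pause(&fds[i]);
        }
    }
    return count;
}
```

    In the main loop you would call collect_ready, notify the worker for each descriptor in ready_out, and call watch_resume on the matching entry when that worker reports back. Since the workers run in other threads, the resume has to be applied by the polling thread itself (or under suitable synchronization), which is the same constraint the fd_set has in the select version.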