linux vmware virtual-machine solaris-10

select() inside infinite loop uses significantly more CPU on RHEL 4.8 virtual machine than on a Solaris 10 machine


I have a daemon app written in C that is currently running with no known issues on a Solaris 10 machine. I am in the process of porting it to Linux, and I have had to make only minimal changes. During testing it passes all test cases, and there are no issues with its functionality. However, when 'idle' on my Solaris machine it uses around 0.03% CPU, while on the virtual machine running Red Hat Enterprise Linux 4.8 the same process uses all available CPU (usually somewhere in the 90%+ range).

My first thought was that something must be wrong with the event loop. The event loop is an infinite loop (while(1)) with a call to select(). The timeval is set up so that timeval.tv_sec = 0 and timeval.tv_usec = 1000. This seems reasonable enough for what the process is doing. As a test I bumped timeval.tv_sec to 1. Even after doing that I saw the same issue.

Is there something I am missing about how select() works on Linux vs. Unix? Or does it work differently with an OS running on a virtual machine? Or maybe there is something else I am missing entirely?

One more thing: I am not sure which version of VMware Server is being used. It was updated about a month ago, though.


Solution

  • I believe that Linux returns the remaining time by writing it into the timeout parameter of the select() call, and Solaris does not. POSIX allows select() to modify the timeout, so a programmer who isn't aware of that part of the spec might not reset the timeout between calls to select().

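    This would result in the first call using a 1000 usec timeout and every later call using a 0 usec timeout. With a zero timeout select() returns immediately, so the loop spins without ever sleeping, which would explain the near-100% CPU usage. It would also explain why raising tv_sec to 1 made no difference, if the value is only set once before the loop.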
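
If that is the cause, the usual fix is to re-initialize the timeval on every pass through the loop. Below is a minimal sketch of that pattern, not the original daemon's code: the helper name run_idle_loop and its parameters are made up for illustration, and it watches no file descriptors, so select() acts purely as a timer. It returns the elapsed time so the effect can be measured.

```c
#define _POSIX_C_SOURCE 200112L  /* for clock_gettime() under strict -std modes */
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical helper: run `iterations` passes of an idle event loop that
 * waits `usec` microseconds per pass, and return the elapsed time in
 * microseconds. No descriptors are watched, so select() is only a timer. */
long run_idle_loop(int iterations, long usec)
{
    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);

    for (int i = 0; i < iterations; i++) {
        /* Re-initialize the timeout on every pass. Linux overwrites it
         * with the time remaining, which is zero after a full timeout. */
        struct timeval tv;
        tv.tv_sec = usec / 1000000;
        tv.tv_usec = usec % 1000000;

        /* A real daemon would check the return value and errno here. */
        (void)select(0, NULL, NULL, NULL, &tv);
    }

    clock_gettime(CLOCK_MONOTONIC, &end);
    return (end.tv_sec - start.tv_sec) * 1000000L
         + (end.tv_nsec - start.tv_nsec) / 1000L;
}
```

Calling run_idle_loop(20, 1000) should take at least about 20 ms on either platform. If the struct timeval were instead filled in once before the loop, only the first pass would be expected to sleep on Linux, and the remaining passes would return immediately.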