I want to measure CPU "starvation" on a Linux host. It seems that one way to do this is to read the procs_running field from the /proc/stat file: if this value is greater than the number of CPUs online, the system has more tasks ready to run than it has CPUs to run them.
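The check described above can be sketched roughly as follows. This is only an illustration: the function names are made up for this example, and it assumes that `os.sysconf("SC_NPROCESSORS_ONLN")` is an acceptable way to get the number of online CPUs on the host.

```python
import os


def parse_procs_running(stat_text):
    """Return the procs_running value from the contents of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "procs_running":
            return int(fields[1])
    raise ValueError("procs_running not found")


def is_oversubscribed(stat_text, cpus_online):
    """True if more tasks are runnable than there are CPUs online."""
    return parse_procs_running(stat_text) > cpus_online


if __name__ == "__main__":
    # Assumption: a Linux host where /proc/stat is readable.
    with open("/proc/stat") as f:
        text = f.read()
    cpus = os.sysconf("SC_NPROCESSORS_ONLN")
    print("procs_running:", parse_procs_running(text), "CPUs online:", cpus)
    print("oversubscribed:", is_oversubscribed(text, cpus))
```

Each run only captures the instant at which /proc/stat was read, which is exactly the sampling problem described next.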
However, if I poll this value from /proc/stat, for example once a second, I can miss moments when procs_running was higher than the number of CPUs, and thus miss events when the system was unable to run a task due to a shortage of CPU resources.
Is there any way to reliably monitor such events in a Linux system?
You might want the Pressure Stall Information "some CPU" statistic. It's designed to answer a question that's at least similar to what you're looking for.
"How often did a CPU (i.e. a hardware thread) have two or more tasks ready to run?"
https://lwn.net/Articles/753840/
If it's built into your kernel, you'll have a /proc/pressure directory. There's some documentation here, and "Pressure Stall Information" is fairly Googleable.
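As a rough sketch of how this could be used: /proc/pressure/cpu contains a "some" line with avg10, avg60 and avg300 percentages plus a cumulative total of stall time in microseconds. Because total only ever accumulates, the difference between two reads covers the whole interval, so short stalls between polls are not lost. The function names below are made up for this example, and the one-second interval is arbitrary.

```python
import time


def parse_pressure(text):
    """Parse the contents of a /proc/pressure/* file into nested dicts."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        values = {}
        for item in parts[1:]:
            key, _, value = item.partition("=")
            # total is an integer count of microseconds; avgN are percentages.
            values[key] = int(value) if key == "total" else float(value)
        result[parts[0]] = values
    return result


def stalled_fraction(total_before_us, total_after_us, elapsed_s):
    """Fraction of the interval during which some task was stalled on CPU."""
    return (total_after_us - total_before_us) / (elapsed_s * 1_000_000)


def read_some_total(path="/proc/pressure/cpu"):
    with open(path) as f:
        return parse_pressure(f.read())["some"]["total"]


if __name__ == "__main__":
    # Assumption: the kernel has PSI built in and enabled.
    t0, before = time.monotonic(), read_some_total()
    time.sleep(1.0)
    t1, after = time.monotonic(), read_some_total()
    print("stalled fraction:", stalled_fraction(before, after, t1 - t0))
```

A result of 0.25, for instance, would mean that for a quarter of the interval at least one task was runnable but waiting for a CPU.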