How can CPU steal time be measured with Prometheus node_exporter CPU metrics?
We have an OpenStack/KVM environment and we want to measure how much CPU steal (as a percentage) occurs on our compute hosts (hypervisors).
Node exporter exposes the metric node_cpu_seconds_total, whose mode "steal" is a counter of the CPU time that was stolen.
This metric is exposed by default as part of the cpu collector.
An expression like
sum by (instance, cpu) (rate(node_cpu_seconds_total{mode="steal"}[2m]))
shows how much CPU time was stolen on every CPU of every machine, as a fraction between 0 and 1 (seconds stolen per second); multiply the result by 100 to get a percentage.
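To make the arithmetic behind rate() concrete, here is a minimal Python sketch of the same idea applied to two raw samples of the steal counter. The function name steal_percent and its parameters are hypothetical, chosen for illustration only; Prometheus itself also extrapolates to the window boundaries, which this sketch leaves out.

```python
def steal_percent(prev_steal, curr_steal, prev_ts, curr_ts):
    """Approximate CPU steal, in percent, between two counter samples.

    prev_steal / curr_steal: values of node_cpu_seconds_total{mode="steal"}
    for one CPU, in seconds. prev_ts / curr_ts: sample timestamps in seconds.
    This is an illustrative helper, not part of any Prometheus library.
    """
    elapsed = curr_ts - prev_ts
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")

    delta = curr_steal - prev_steal
    if delta < 0:
        # The counter went backwards, so assume it was reset (e.g. a reboot)
        # and count only what accumulated since the reset.
        delta = curr_steal

    # Stolen seconds per elapsed second, scaled to a percentage.
    return 100.0 * delta / elapsed


if __name__ == "__main__":
    # 6 seconds stolen over a 120-second window is 5% steal.
    print(steal_percent(120.0, 126.0, 0.0, 120.0))
```

The per-second ratio is what the PromQL expression above returns for each CPU; the factor of 100 is the only step needed to turn it into a percentage.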