prometheus promql prometheus-node-exporter cadvisor

Obtaining CPU Usage (sorted by host and name) with Prometheus and cAdvisor in Grafana


I'd like to thank @markalex for his clear and informative answer to my previous question. It solves the problem as I originally stated it, but for simplicity I left out the fact that I have two different hosts and was also trying to divide by each host's own cores in the same query. I wanted to add this information to the original question, but was made aware that doing so is a bad idea: it does a disservice both to anyone looking for a clear answer to that question and to the person who answered it.

On to the question: is it possible to make that same query (from the link above) separate CPU usage not only by name but also by host (I have two hosts)? Here's what I tried, but it doesn't work:

sum(rate(container_cpu_usage_seconds_total{host=~".+",name=~".+"}[$__rate_interval])) by (host,name) / on() group_left() sum(machine_cpu_cores{host=~".+"})

Admittedly, I didn't expect this query to work, since it just divides by the combined number of cores of both hosts. I had hoped there was an intuitive way to keep the hosts separate and match each one against the left-hand side when performing the division, using only one query. If all I can do is use two queries and name the hosts explicitly, I'll accept that, but I'm hoping there's something I missed. After all, I don't yet know exactly what I'm doing.

Edit:

I may have found a way to do this with the query below:

sum(rate(container_cpu_usage_seconds_total{host=~".+",name=~".+"}[$__rate_interval])) by (host,name) / on(host) group_left() sum(machine_cpu_cores{host=~".+"}) by (host)

Please let me know if this is feasible. This time I added the edit before anyone attempted to answer, so hopefully this is OK.


Solution

  • The edit above works for me, so I'll post it as the answer:

    sum(rate(container_cpu_usage_seconds_total{host=~".+",name=~".+"}[$__rate_interval])) by (host,name) / on(host) group_left() sum(machine_cpu_cores{host=~".+"}) by (host)
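To see why matching on host changes the result, here is a rough Python sketch of what the division does. It is not how Prometheus evaluates the query, just the same arithmetic on made-up numbers. The host names, container names, values, and the helper functions `sum_by_host` and `divide_on_host` are all hypothetical and exist only for this illustration.

```python
from collections import defaultdict

# Hypothetical per-container CPU rates, keyed by (host, name).
# This stands in for the left-hand side after `sum(...) by (host,name)`.
container_cpu_rate = {
    ("host-a", "web"): 1.0,
    ("host-a", "db"): 0.5,
    ("host-b", "web"): 2.0,
}

# Hypothetical machine_cpu_cores samples as (host, cores) pairs.
machine_cpu_cores = [
    ("host-a", 4),
    ("host-b", 8),
]


def sum_by_host(samples):
    """Mimic `sum(...) by (host)`: add up values that share a host label."""
    totals = defaultdict(float)
    for host, value in samples:
        totals[host] += value
    return dict(totals)


def divide_on_host(left, right_by_host):
    """Mimic `left / on(host) group_left() right`.

    Many (host, name) series on the left are matched to the single
    right-hand series with the same host. Left-hand series whose host
    has no match on the right are dropped from the result.
    """
    return {
        (host, name): value / right_by_host[host]
        for (host, name), value in left.items()
        if host in right_by_host
    }


cores_by_host = sum_by_host(machine_cpu_cores)
per_host_usage = divide_on_host(container_cpu_rate, cores_by_host)

# For comparison: the first attempt, `on()` against `sum(machine_cpu_cores)`,
# divides every container by the combined core count of all hosts.
total_cores = sum(cores_by_host.values())
combined_usage = {
    key: value / total_cores for key, value in container_cpu_rate.items()
}

for key in sorted(per_host_usage):
    print(key, round(per_host_usage[key], 4), round(combined_usage[key], 4))
```

With these numbers, the "web" container on host-a comes out at 1.0 / 4 when matched per host, but 1.0 / 12 when divided by the combined cores, which is the difference between the two queries in the question.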