I'm using cAdvisor and Prometheus to monitor Docker containers. I start the application using a docker-compose.yml file.
In the cAdvisor docs, I read that the --enable_metrics and --disable_metrics flags can be used to select only a subset of metrics to monitor.
However, as soon as I supply any of these flags, cAdvisor appears to only monitor itself.
The --ignore_containers=cadvisor flag isn't working either, so I must be doing something wrong?
This is my docker-compose file:
version: '3.2'
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    ports:
      - 9090:9090
    command:
      - --config.file=/etc/prometheus/prometheus.yml
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
    depends_on:
      - cadvisor
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    container_name: advisor
    # also tried:
    # command: "--enable_metrics=memory"
    # command: --enable_metrics=memory
    # and more...
    command:
      - --disable_metrics=accelerator,advtcp,app,cpu,cpuLoad,cpu_topology,cpuset,disk,diskIO,hugetlb,memory_numa,network,oom_event,percpu,perf_event,process,referenced_memory,resctrl,sched,tcp,udp
      - --enable_metrics=cpuLoad,memory,network
      - --ignore_containers=cadvisor
    ports:
      - 8080:8080
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
  ... containers ...
EDIT:
I tried what user e0031374 suggested, and fixed the cAdvisor version to v0.47.1. However, I noticed that as soon as I provide any command, the container exits in an unhealthy state. Without the command everything runs fine.
For example, when I add
    command:
      - "--version"
docker ps -a shows:
5c4092450466 gcr.io/cadvisor/cadvisor "/usr/bin/cadvisor -…" 24 seconds ago Exited (0) 16 seconds ago
docker container inspect 5c4092450466 shows:
...
"Created": "2023-04-20T15:57:15.491743018Z",
"Path": "/usr/bin/cadvisor",
"Args": [
    "-logtostderr",
    "--version"
],
"State": {
    "Status": "exited",
    "Running": false,
    "Paused": false,
    "Restarting": false,
    "OOMKilled": false,
    "Dead": false,
    "Pid": 0,
    "ExitCode": 0,
    "Error": "",
    "StartedAt": "2023-04-20T15:57:22.532921383Z",
    "FinishedAt": "2023-04-20T15:57:22.726984392Z",
    "Health": {
        "Status": "unhealthy",
        "FailingStreak": 0,
        "Log": []
    }
},
...
What am I missing here? Thanks!
Try running cAdvisor with the runtime option command: "--version" to check the version of the cAdvisor image you are actually using.
cAdvisor may be below v0.41.0 even though the latest tag is specified. If so, you may need to pin a later version explicitly, for example image: gcr.io/cadvisor/cadvisor:v0.47.0
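As a rough sketch of what the pinned service could look like, assuming the rest of the compose file stays as in the question: the fragment below fixes the image tag and passes only --enable_metrics. The assumption here, based on my reading of the cAdvisor runtime options docs, is that --enable_metrics overrides --disable_metrics when it is set, so supplying both should not be necessary. The metric names are taken from the question. Also keep in mind that --version should just print the version and exit, so a container started with it is not expected to stay running; it is only useful as a one-off check.

```yaml
services:
  cadvisor:
    # Pinned tag instead of "latest"; v0.47.0 is the version suggested above.
    image: gcr.io/cadvisor/cadvisor:v0.47.0
    command:
      # Assumption: --enable_metrics takes precedence over --disable_metrics,
      # so only the metrics listed here should be collected.
      - --enable_metrics=cpuLoad,memory,network
    ports:
      - 8080:8080
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
```

If the container still reports only itself with this configuration, the flags are probably not the cause and the volume mounts would be the next thing to check.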