docker-swarm prometheus cadvisor

Docker Swarm - Prometheus cannot access Cadvisor: dial tcp 10.0.0.50:8090: connect: connection refused


I run a complete Docker Swarm environment on Windows 10 Pro. Prometheus and cAdvisor are part of the Docker Swarm stack. I am building the monitoring tools step by step and will then deploy the monitoring to a cloud solution.

In the Docker Swarm stack I can run Prometheus and cAdvisor, but Prometheus cannot connect to cAdvisor. I get the message:

Get http://cadvisor:8090/metrics: dial tcp 10.0.0.50:8090: connect: connection refused

How can I get Prometheus to access cAdvisor?

In my browser I can open 'localhost:8090/metrics' and get all the metrics, so cAdvisor is definitely running.

I have one stack file that creates the network (devhome_default). In my second stack I refer to this network.

UPDATE: one way to solve this is to use the IP address, which I looked up with:

$ ipconfig

Using that address in my Prometheus config file works fine, but it hard-wires the target and is not maintainable.

The stack / docker-compose file is:

version: '3'
services:
  cadvisor:
    image: google/cadvisor
    networks:
      - geosolutionsnet
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /:/rootfs
      - /var/run:/var/run
      - /sys:/sys
      - /var/lib/docker/:/var/lib/docker
    ports:
      - 8090:8080
    deploy:
      mode: global
      resources:
        limits:
          cpus: '0.10'
          memory: 128M
        reservations:
          cpus: '0.10'
          memory: 64M

  prometheus:
    image: prom/prometheus:v2.8.0
    ports:
      - "9090:9090"
    networks:
      - geosolutionsnet
    volumes:
      - //k/data/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
    deploy:
      mode: replicated
      replicas: 1

networks:
  geosolutionsnet:
    external:
      name: devhome_default

The prometheus config file is:

global:
  scrape_interval:     15s
  evaluation_interval: 15s

rule_files:
  #- "alert.rules_nodes"
  #- "alert.rules_tasks"
  #- "alert.rules_service-groups"

scrape_configs:
  - job_name: 'prometheus'
    dns_sd_configs:
    - names:
      - 'tasks.prometheus'
      type: 'A'
      port: 9090
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8090']
        labels:
          alias: "cadvisor"

Alternatively, I also tried this for cAdvisor:

- job_name: 'cadvisor'
  dns_sd_configs:
  - names:
    - 'tasks.cadvisor'
    type: 'A'
    port: 8090

And also:

- job_name: 'cadvisor'
  static_configs:
    - targets: ['localhost:8090']

Solution

  • In several cloud environments the 'dns' solution below works like a charm. Because a 'real' cloud environment is the target environment for our Docker containers, this de facto standard solution suffices.

    So, this works well in cloud environments:

    - job_name: 'cadvisor'
      dns_sd_configs:
      - names:
        - 'tasks.cadvisor'
        type: 'A'
        port: 8090
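
If the same job still reports "connection refused" on a local Swarm, one thing worth checking is the port. The stack file above publishes cAdvisor as 8090:8080, so 8090 is the port on the host, while containers on the shared overlay network normally reach a service on its container port. The fragment below is a minimal sketch under that assumption (cAdvisor listening on container port 8080, as the port mapping suggests); only the port differs from the job above.

```yaml
scrape_configs:
  - job_name: 'cadvisor'
    dns_sd_configs:
      - names:
          # Swarm DNS name that resolves to the IP of every cadvisor task
          - 'tasks.cadvisor'
        type: 'A'
        # Assumption: container-side port from the "8090:8080" mapping,
        # not the published host port 8090
        port: 8080
```

With `mode: global`, `tasks.cadvisor` should return one address per node, so each cAdvisor instance would show up as its own target on the Prometheus targets page.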