kubernetes · prometheus · grafana · orleans

Prometheus configuration for monitoring Orleans in Kubernetes


I'm trying to get Prometheus working with my Orleans silos...

  1. I use this consumer to expose Orleans metrics to Prometheus on port 8082. With a local Prometheus instance and the grafana.json from the same repository, I can see that it works (see the hosting sketch after this list).

      _ = builder.AddPrometheusTelemetryConsumerWithSelfServer(port: 8082);
    
  2. I followed this guide to install Prometheus on Kubernetes, in a different namespace than the one my silos are deployed in.

  3. Following the instructions, I added the Prometheus annotations to my Orleans deployment YAML:

      spec:
        replicas: 2
        selector:
          matchLabels:
            app: mysilo
        template:
          metadata:
            annotations:
              prometheus.io/scrape: 'true'
              prometheus.io/port: '8082'
            labels:
              app: mysilo
    

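For context, here is a minimal sketch of where that consumer registration sits in the silo host setup. Only the `AddPrometheusTelemetryConsumerWithSelfServer(port: 8082)` call comes from step 1; the hosting boilerplate and `UseLocalhostClustering` are illustrative assumptions, and I'm assuming the extension targets the silo builder:

    // Minimal hosting sketch (assumption: the Prometheus consumer extension
    // from the repository referenced in step 1 targets the silo builder).
    using Microsoft.Extensions.Hosting;
    using Orleans.Hosting;

    var host = Host.CreateDefaultBuilder(args)
        .UseOrleans(builder =>
        {
            builder.UseLocalhostClustering(); // placeholder; use your real clustering
            // Expose Orleans metrics on port 8082, matching the
            // prometheus.io/port annotation on the deployment.
            _ = builder.AddPrometheusTelemetryConsumerWithSelfServer(port: 8082);
        })
        .Build();

    await host.RunAsync();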
My scrape job in prometheus.yml:

    - job_name: "orleans"
      kubernetes_sd_configs:
        - role: pod
          namespaces:
            names:
              - orleans
          selectors:
            - role: "pod"
              label: "app=mysilo"

According to the same guide, all the pods' metrics get discovered if "the pod metadata is annotated with prometheus.io/scrape and prometheus.io/port annotations". I assume I don't need any extra installations, since the guide's bundled config presumably already acts on those annotations.
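For reference, those annotations are only a convention: Prometheus honors them only when the scrape config carries annotation-driven relabel rules. A sketch of the conventional rules (adapted from the standard Kubernetes example config, not copied from my job above):

    relabel_configs:
      # Scrape only pods that opt in via the prometheus.io/scrape annotation.
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      # Rewrite the target address to the port named in prometheus.io/port.
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__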

With all this, and port-forwarding my Prometheus pod, I can see Prometheus is working at http://localhost:9090/metrics, but no metrics are being shown in my Grafana dashboard (again, I could make it work on my local machine with only one silo).
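The port-forward itself is just the usual command; the pod name and namespace below are placeholders for whatever the guide's install created:

    # Forward the Prometheus UI to localhost:9090 (pod name and namespace are
    # placeholders; look them up with `kubectl get pods -n <namespace>`).
    kubectl port-forward -n monitoring pod/prometheus-server-0 9090:9090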

When exploring in Grafana, I find that it apparently can't find any instances; the dashboard's query expands to an empty instance matcher:

    sum(rate(process_cpu_seconds_total{job=~"orleans", instance=~"()"}[3m])) * 100
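A simpler sanity check, before worrying about the dashboard query, is to ask Prometheus whether the job has any targets at all:

    # One series per scraped target; an empty result means service discovery
    # (or relabeling) never produced a target for the job.
    up{job="orleans"}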

The aim is to monitor the resources my Orleans silos are using (not the pod metrics themselves, but Orleans metrics), but I'm missing something :(


Solution

  • Thanks to @BozoJoe's comment, I could debug this.

    The problem was that Prometheus was trying to scrape ports 30000 and 11111 (the Orleans gateway and silo ports exposed on the pods) instead of 8082 as described above. I could see this in the Prometheus dashboard at localhost:9090/targets.
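    The same information is available from the HTTP API, which is handy without a browser:

      # Show what Prometheus is actually scraping (same data as /targets).
      curl -s http://localhost:9090/api/v1/targets \
        | jq '.data.activeTargets[] | {scrapeUrl, health, lastError}'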

    So I went to the Prometheus config file and made sure it scrapes the correct port (I also restricted discovery to containers matching my silo name):

      - job_name: "orleans"
        kubernetes_sd_configs:
          - role: pod
            namespaces:
              names:
                - orleans
            selectors:
              - role: "pod"
                label: "app=mysilo"
        relabel_configs:
          # Keep only pods whose container name starts with the silo name
          # (Prometheus anchors regexes, so use .* rather than a bare *).
          - source_labels: [__meta_kubernetes_pod_container_name]
            action: keep
            regex: 'my-silo-name.*'
          # Rewrite whatever address discovery produced to the metrics port.
          - source_labels: [__address__]
            action: replace
            regex: ([^:]+):.*
            replacement: $1:8082
            target_label: __address__
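    After editing, it's worth validating the file and reloading Prometheus. A sketch; the reload endpoint only works if Prometheus runs with --web.enable-lifecycle, otherwise restart the pod:

      # Validate the edited config before applying it.
      promtool check config prometheus.yml

      # Ask a running Prometheus to reload its configuration.
      curl -X POST http://localhost:9090/-/reload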