I want to be able to specify all my rules for, say, prometheus-blackbox-exporter,
so I have added them to a rules-mine.yaml
and deployed with
helm upgrade --install -n monitoring blackbox -f values.yaml -f rules-mine.yaml .
I cannot see any rules listed at http://localhost:9090/rules, and nothing seems to be evaluated, as no alerts fire. I need to do everything as IaC and deploy through Terraform in an automated fashion.
The rules-mine.yaml file contains:
prometheusRule:
  enabled: true
  namespace: monitoring
  additionalLabels:
    team: foxtrot_blackbox
    environment: production
    cluster: cluster
    namespace: namespace_x
  namespace: "monitoring"
  rules:
    - alert: BlackboxProbeFailed
      expr: probe_success == 0
      for: 0m
      labels:
        severity: critical
      annotations:
        summary: Blackbox probe failed (instance {{`{{`}} $labels.instance {{`}}`}})
        description: "Probe failed\n VALUE = {{`{{`}} $value {{`}}`}}"
    - alert: BlackboxSlowProbe
      expr: avg_over_time(probe_duration_seconds[1m]) > 1
      for: 1m
      labels:
        severity: warning
      annotations:
        summary: Blackbox slow probe (instance {{`{{`}} $labels.instance {{`}}`}})
        description: "Blackbox probe took more than 1s to complete\n VALUE = {{`{{`}} $value {{`}}`}}"
Thanks for your help.
A colleague found that this is entirely possible. The problem seemed to be related to the quoting used in the original implementation. The following is now in use and working, so I am posting it here in the hope that it will be useful for others.
In summary:
{{`{{`}} $labels.instance {{`}}`}} == BAD
{{`{{$labels.instance}}`}} == GOOD

prometheusRule:
  enabled: true
  additionalLabels:
    client: ${client_id}
    cluster: ${cluster}
    environment: ${environment}
    grafana: ${grafana_url}
  rules:
    - alert: BlackboxProbeFailed
      expr: probe_success == 0
      for: 1m
      labels:
        severity: critical
      annotations:
        summary: Blackbox probe failed for {{`{{$labels.instance}}`}}
        description: Probe failed VALUE = {{`{{$value}}`}}
        dashboard_url: https://${grafana_url}/d/blackbox/blackbox-exporter?var-instance={{`{{$labels.instance}}`}}
        runbook_url: ${wiki_url}/BlackboxProbeFailed
    - alert: BlackboxSlowProbe
      expr: avg_over_time(probe_duration_seconds[1m]) > 1
      for: 2m
      labels:
        severity: warning
      annotations:
        summary: Blackbox slow probe for {{`{{$labels.instance}}`}}
        description: Blackbox probe took more than 1s to complete VALUE = {{`{{$value|humanizeDuration}}`}}
        dashboard_url: https://${grafana_url}/d/blackbox/blackbox-exporter?var-instance={{`{{$labels.instance}}`}}
        runbook_url: ${wiki_url}/BlackboxSlowProbe
Please ignore any missing variables, etc.
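
For anyone wanting to check what the two quoting forms from the summary turn into, here is a minimal sketch using Go's text/template, the engine Helm templates are built on. It assumes the annotation string goes through exactly one template pass with no data; whether and how often the chart actually templates prometheusRule.rules is not something I have verified, and the render helper is a made-up name for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// render is a hypothetical helper: it runs ONE pass of Go's text/template
// over src with no data, roughly what a single Helm templating step does.
func render(src string) (string, error) {
	tmpl, err := template.New("rule").Parse(src)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, nil); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	inputs := []string{
		// Delimiters escaped separately (the form marked BAD above).
		"{{`{{`}} $labels.instance {{`}}`}}",
		// Whole expression escaped as one raw string (the form marked GOOD).
		"{{`{{$labels.instance}}`}}",
		// No escaping at all: the engine tries to evaluate $labels itself.
		"{{ $labels.instance }}",
	}
	for _, src := range inputs {
		out, err := render(src)
		if err != nil {
			// The unescaped form ends up here, since $labels is not
			// defined at Helm/Go template time.
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(out)
	}
	// The first two inputs print:
	//   {{ $labels.instance }}
	//   {{$labels.instance}}
}
```

After a single pass both escaped forms leave a literal Prometheus expression behind, differing only in whitespace, so this sketch does not by itself explain why one form worked and the other did not. It is mainly useful for confirming that the unescaped form fails outright and for experimenting with what a given escape produces before deploying.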