I want to access lables of the metric which matches the expression to inform the team about which exact queue exceeds the limit:
An example of the metric
job_duration_bucket{resource_server="rs1",queue="default",connection="database",le="15"} 445
job_duration_bucket{resource_server="rs1",queue="default",connection="database",le="20"} 567
job_duration_bucket{resource_server="rs1",queue="high",connection="database",le="15"} 123
job_duration_bucket{resource_server="rs1",queue="high",connection="database",le="20"} 124
And my alerting rule
groups:
- name: alert.rules
rules:
- alert: JobDurationExceedance
expr: job_duration_bucket{queue="high",le="20"} > job_duration_bucket{queue="high",le="15"}
for: 1s
labels:
severity: critical
annotations:
{ % raw % }summary: {{ $labels.resource_server }} high prio job duration exceed limit
description: "On the {{ $labels.resource_server }} {{ $labels.queue }} prio {{ $labels.connection }} queue currently jobs exists, which are needing more than 15 minutes or more from being queued to completion."{ % endraw % }
The alert message should look like
On the rs1 high prio database queue currently jobs exist...
If there is any way to achieve this I will be very thankful for advice!
I only found a way to add job labels and not metric labels
You have to quote the summary as well, then it should work fine.