I have a Prometheus metric file_count_RAN_RRC{folder="RAN_RRC", instance="jobserver:9669", job="files_monitor"},
which gives me the count of files that were created in a folder during one hour. So in a day I get 24 distinct values of file_count_RAN_RRC,
even though the endpoint is scraped every two minutes.
I want to set up an alert using Alertmanager to send a Slack notification if the total count of files over the last day is less than 1000.
What should my /etc/prometheus/alert.rules.yml
look like for this task?
In Prometheus, a range selector can take an additional resolution, like [5m:1m]; this is called a subquery.
In this example, the subquery returns results for the last 5 minutes with one data point every 1 minute.
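As a quick illustration (a minimal sketch reusing your metric name), this subquery evaluates the metric every minute over the last 5 minutes and takes the maximum of those 5 points:

max_over_time(file_count_RAN_RRC[5m:1m])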
So for your task you can use sum_over_time(file_count_RAN_RRC[24h:1h])
to get the total count of files created over the last day. With the 1h resolution, each hourly value is summed roughly once, instead of once per two-minute scrape.
The alert rule in this case will look like this:
groups:
  - name: files_monitor
    rules:
      - alert: FilesAreNotUpdated
        expr: sum_over_time(file_count_RAN_RRC[24h:1h]) < 1000
        labels:
          severity: page
        annotations:
          summary: "Folder {{ $labels.folder }} is not updated."
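To actually get the Slack notification, the rules file also has to be loaded by Prometheus and the alert routed through Alertmanager. A minimal sketch for prometheus.yml, assuming Alertmanager is reachable at localhost:9093 (adjust the target to your setup):

rule_files:
  - /etc/prometheus/alert.rules.yml

alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']

And a minimal Slack receiver in alertmanager.yml; the webhook URL and channel are placeholders you would replace with your own:

route:
  receiver: slack-notifications

receivers:
  - name: slack-notifications
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/XXX/YYY/ZZZ'  # placeholder incoming-webhook URL
        channel: '#alerts'                                       # placeholder channel
        text: '{{ .CommonAnnotations.summary }}'

After reloading both services, the FilesAreNotUpdated alert should reach the configured Slack channel once the expression drops below 1000.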