
Get file count in the last day


I have a Prometheus metric file_count_RAN_RRC{folder="RAN_RRC", instance="jobserver:9669", job="files_monitor"} that gives me the number of files created in a folder during one hour. So in a day I get 24 distinct values of file_count_RAN_RRC, even though the endpoint is scraped every two minutes.

I want to set up an alert using Alertmanager that sends a Slack notification if the total count of files in the last day is less than 1000.

What should my /etc/prometheus/alert.rules.yml look like for this task?


Solution

  • Prometheus supports subqueries, which let you add a resolution to a range selector, like this: [5m:1m]. In this example, the range selector returns results for the last 5 minutes with one data point every 1 minute.

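    So for your task you can use sum_over_time(file_count_RAN_RRC[24h:1h]) to get the total count of files created in the last 24 hours. The 1h resolution matters here: a plain [24h] range would sum every 2-minute scrape and count each hourly value roughly 30 times.

    The alert rule in this case will look like this: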

      - alert: FilesAreNotUpdated
        expr: sum_over_time(file_count_RAN_RRC[24h:1h]) < 1000
        labels:
          severity: page
        annotations:
          summary: Folder {{ $labels.folder }} is not updated.
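
    Since the question asks about the whole file, here is a minimal sketch of how /etc/prometheus/alert.rules.yml could look with the rule above wrapped in a rule group. The group name files_monitor_alerts is a placeholder I made up, and the sketch assumes the file is listed under rule_files in prometheus.yml. The Slack notification itself is not configured here; it would be set up as a receiver in the Alertmanager configuration.

    ```yaml
    groups:
      # Hypothetical group name; any unique name works.
      - name: files_monitor_alerts
        rules:
          - alert: FilesAreNotUpdated
            # Sum the hourly file counts over the last 24 hours,
            # sampling the metric once per hour via the subquery.
            expr: sum_over_time(file_count_RAN_RRC[24h:1h]) < 1000
            labels:
              severity: page
            annotations:
              summary: "Folder {{ $labels.folder }} is not updated."
    ```

    You can validate the file before reloading Prometheus with promtool check rules /etc/prometheus/alert.rules.yml.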