I'm using Pushgateway with Prometheus and everything works, but after a couple of weeks Pushgateway collapses. Looking into it, there are tons of metrics that are no longer used, and deleting them manually is practically impossible.
Is there a way to expire Pushgateway metrics with a TTL or some other retention setting, for example by size or by time, or maybe both?
NOTE: On the Prometheus mailing list I have seen many people asking for something like this for a year or more, and the only answer so far is that this is not the Promethean way to do it. Really? Come on, if this is a real pain for a lot of people, maybe there should be a better way (even if it's not the Promethean way).
Supposing you want to remove the metrics related to a group when they become too old (for a given definition of too old), you can use the metric push_time_seconds, which Pushgateway defines automatically for every group. Its value is the Unix timestamp of the last successful push to that group:
push_time_seconds{instance="foo",job="bar",try="longtime"} 1.598280005888635e+09
With this information, you can write a script that fetches this metric and uses its value to identify old groups of data (here {instance="foo",job="bar",try="longtime"}). The API lets you delete all metrics of such a group:
curl -X DELETE http://pushgateway:9091/metrics/job/bar/instance/foo/try/longtime
This can be done in a few lines of Bash or Python.
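
As a rough illustration, here is a minimal Python sketch of such a cleanup script, using only the standard library. The base URL, the 7-day threshold, and the function names (stale_groups, delete_url, purge) are my own assumptions, not anything Pushgateway ships. It also assumes label values contain no "/" (those need the base64 form of the URL, which is not handled here), and it drops empty labels such as instance="" from the URL, since Pushgateway shows an empty instance label for groups pushed without one.

```python
import re
import time
import urllib.parse
import urllib.request

PUSHGATEWAY = "http://pushgateway:9091"   # assumption: adjust to your setup
MAX_AGE_SECONDS = 7 * 24 * 3600           # assumption: "too old" means 7 days

# Matches lines such as: push_time_seconds{instance="foo",job="bar"} 1.59e+09
LINE_RE = re.compile(r'^push_time_seconds\{(?P<labels>.*)\}\s+(?P<value>\S+)$')
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def stale_groups(metrics_text, now, max_age):
    """Return the label sets of groups whose last push is older than max_age."""
    stale = []
    for line in metrics_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # comments and all other metrics are ignored
        if now - float(match.group("value")) > max_age:
            stale.append(dict(LABEL_RE.findall(match.group("labels"))))
    return stale


def delete_url(base, labels):
    """Build the DELETE URL for one group; the job label must come first."""
    labels = dict(labels)
    parts = ["metrics", "job", urllib.parse.quote(labels.pop("job"), safe="")]
    for name, value in sorted(labels.items()):
        if value == "":
            continue  # empty label (e.g. instance="") is not part of the group
        parts += [name, urllib.parse.quote(value, safe="")]
    return base.rstrip("/") + "/" + "/".join(parts)


def purge(base=PUSHGATEWAY, max_age=MAX_AGE_SECONDS):
    """Fetch /metrics from the Pushgateway and delete every stale group."""
    with urllib.request.urlopen(base + "/metrics") as resp:
        text = resp.read().decode("utf-8")
    for labels in stale_groups(text, time.time(), max_age):
        req = urllib.request.Request(delete_url(base, labels), method="DELETE")
        urllib.request.urlopen(req).close()
```

Calling purge() from a cron job (or any scheduler) gives you something close to a TTL. Try it against a test Pushgateway first, since a DELETE removes every metric in the group.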