postgresqlrrdtooltimescaledb

Data retention in timescaledb


Trying to wrap my head around timescaledb, but my google-fu is failing me. Most likely because I'm not searching for the correct term.

With RRD tool, old data can be stored as averages, reducing the amount of data being stored.

I can't seem to find out how to do this with timescaledb. I'd like 5 minute resolution for 90 days, but after that, it's pointless to keep all those data points, and I'd like to reduce it to 30 or 60 minute averages for a couple years, then maybe daily averages after that.

Is this something that I can set in the database itself, or is this something I would have to implement in a housekeeping job?


Solution

  • We had the exact same question half a year ago.

    The term "Data Retention" is also used by the timescaledb team. It is currently implemented using drop_chunks policies (see their doc here). It's a Enterprise feature but IMHO not (yet) as useful as it could/should be (and it surely does not do what you are looking for).

    Let me explain: probably the easiest approach for down-sampling your data are Continuous Aggregates (their doc here). You can quite easily aggregate virtually any numeric value to whatever resolution you desire. However, Continuous Aggregates will be affected by the deletions of the drop_chunks, too. Your data is gone.

    One workaround would be to create other Hypertables instead. Then, create your own background workers copying the data from the original, hi-res table to these new lo-res Hypertables. For housekeeping, either use the Data Retention Enterprise feature or create your own background workers.