[SOLVED] Why is Prometheus not suitable for long-term storage?

I am considering using Prometheus as a time-series database to store data for long periods of time (months, or maybe even over a year).

However, I have read in a few places that Prometheus is not suitable for long-term storage and that another TSDB would be a better solution in that case. But why exactly is it not suitable, and what are the downsides of using it for long-term storage?

The official docs mention:

Prometheus's local storage is not intended to be durable long-term storage; external solutions offer extended retention and data durability.

But what does "extended retention and data durability" mean exactly, and why is it not achievable with Prometheus?

Solution

It is a design decision, and it mainly has to do with the scope of the project. The original authors, in the context of their use case at SoundCloud, decided not to build a distributed data storage layer and to keep things simple instead.

In other words: Prometheus will fill up a disk, but it doesn't shard or replicate the data for you. Now, if you have many different environments you want to monitor, creating hundreds of thousands of time series and a gazillion metrics, that won't scale (local disks are too small, and an NFS-based solution might not be what you want either). So there are different solutions out there that allow you to federate and/or deduplicate metrics from different environments.
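To get a feel for how quickly a single disk fills up, here is a rough sketch based on the capacity-planning formula in the Prometheus storage documentation (needed disk space = retention time in seconds * ingested samples per second * bytes per sample, with roughly 1-2 bytes per sample on average). The function name, the series count and the scrape interval are made-up values for illustration, not recommendations.

```python
# Back-of-the-envelope estimate of Prometheus local disk usage.
# Formula from the Prometheus storage docs:
#   needed_disk_space = retention_time_seconds
#                       * ingested_samples_per_second
#                       * bytes_per_sample
# The docs quote roughly 1-2 bytes per sample on average; 2.0 is used here
# as a pessimistic assumption.

def estimate_disk_bytes(active_series, scrape_interval_s, retention_days,
                        bytes_per_sample=2.0):
    """Return the estimated on-disk size in bytes (hypothetical helper)."""
    # Every active series yields one sample per scrape interval.
    samples_per_second = active_series / scrape_interval_s
    retention_seconds = retention_days * 24 * 60 * 60
    return retention_seconds * samples_per_second * bytes_per_sample


if __name__ == "__main__":
    # Assumed workload: 500k active series scraped every 15 seconds.
    for days in (15, 90, 365):
        size = estimate_disk_bytes(active_series=500_000,
                                   scrape_interval_s=15,
                                   retention_days=days)
        print(f"{days:>3} days of retention: ~{size / 1e9:.0f} GB")
```

The estimate grows linearly with both the retention period (set with --storage.tsdb.retention.time) and the number of series, and all of it sits on one node's disk. So even when the disk is big enough, losing that disk means losing the history, which is the "durability" part of the quote from the docs.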

The important thing to remember here is that this is not a shortcoming of Prometheus but a conscious decision to focus on one thing and do it really well. Over time, the project has developed APIs (remote_write and remote_read) that enable others to build systems that address the distributed, at-scale use case.