kubernetes kubectl azure-aks amazon-eks kubernetes-statefulset

When should I use a StatefulSet? Can I deploy a database in a StatefulSet?


I have heard that StatefulSets are suitable for databases, but a StatefulSet creates a separate PVC for each Pod. If I set replicas=3, I get 3 Pods and 3 different PVCs, each holding different data. Database users need a single database with a consistent view, not 3 different views. So it seems we should not use a StatefulSet in this situation. When, then, should we use StatefulSets?


Solution

  • A StatefulSet does three big things differently from a Deployment:

    1. It creates a new PersistentVolumeClaim for each replica (from the volumeClaimTemplates in its spec);
    2. It gives the Pods sequential names, starting with statefulsetname-0; and
    3. By default, it starts the Pods one at a time in a specific order (ascending numerically).
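
To make the naming concrete, here is a minimal Python sketch that lists the Pod and PVC names a StatefulSet would produce. It assumes the usual PVC naming pattern of claim template name, then StatefulSet name, then ordinal; the names "db" and "data" and the helper function itself are hypothetical, chosen only for illustration.

```python
def statefulset_object_names(set_name, claim_template, replicas):
    """Return (pod_name, pvc_name) pairs in the order the Pods are started."""
    names = []
    for ordinal in range(replicas):  # ordinals ascend from 0
        pod = f"{set_name}-{ordinal}"
        # Assumed pattern: <claim template name>-<statefulset name>-<ordinal>
        pvc = f"{claim_template}-{pod}"
        names.append((pod, pvc))
    return names


# Hypothetical StatefulSet "db" with one claim template named "data".
for pod, pvc in statefulset_object_names("db", "data", 3):
    print(pod, pvc)
```

Because each name is derived from the ordinal, a replacement for Pod db-1 is given the same name and reattached to the same claim, which is what lets a replicated database treat each member as a stable identity.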

    This is useful when the database itself knows how to replicate data between its own copies. In Elasticsearch, for example, indexes are broken up into shards, and by default there are two copies of each shard (a primary and one replica). If you have five Pods running Elasticsearch, each one holds a different fraction of the data, but internally the database knows how to route a request to the specific server that has the data in question.
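
The idea of "each Pod holds a fraction of the data, and the database routes requests itself" can be sketched with a toy model in Python. This is not Elasticsearch's actual allocation or routing algorithm, which is more involved; the round-robin placement, the CRC32 hash, and the Pod names es-0 through es-4 are all assumptions made for illustration.

```python
import zlib


def place_shards(num_shards, pods, copies=2):
    """Toy placement: put each shard's copies on distinct Pods, round-robin.

    Assumes copies <= len(pods) so the copies of one shard never collide.
    """
    return {
        shard: [pods[(shard + c) % len(pods)] for c in range(copies)]
        for shard in range(num_shards)
    }


def route(doc_id, num_shards, placement):
    """Toy routing: hash the document ID to a shard, then look up its Pods."""
    shard = zlib.crc32(doc_id.encode("utf-8")) % num_shards
    return shard, placement[shard]


pods = [f"es-{i}" for i in range(5)]      # five hypothetical StatefulSet Pods
placement = place_shards(num_shards=5, pods=pods)
shard, holders = route("order-1234", 5, placement)
print(f"document lives in shard {shard}, stored on {holders}")
```

The point of the sketch is that no Pod holds everything, yet any request can be answered, because the lookup from document to shard to Pod happens inside the database rather than in Kubernetes.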

    I'd recommend using a StatefulSet in preference to manually creating a PersistentVolumeClaim. For database workloads that can't be replicated, you can't set replicas: to anything greater than 1 either way, but the automatic PVC management is still valuable. You usually can't have multiple database processes pointing at the same physical storage, in containers or otherwise, and most types of Volumes can't be shared across Pods.
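
As a rough sketch of the single-replica case, the Python below builds a StatefulSet manifest as a dictionary and prints it as JSON, which kubectl accepts in place of YAML. The helper function, the name "postgres", the image tag, the mount path, and the 10Gi size are assumptions for illustration; a real database would also need settings such as credentials, and a headless Service matching serviceName, none of which are shown here.

```python
import json


def single_replica_statefulset(name, image, mount_path, storage="10Gi"):
    """Build a one-replica StatefulSet manifest whose PVC is managed for us."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,  # assumes a headless Service with this name
            "replicas": 1,        # the database cannot be replicated
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "volumeMounts": [
                            {"name": "data", "mountPath": mount_path},
                        ],
                    }],
                },
            },
            # The StatefulSet creates the PVC from this template.
            "volumeClaimTemplates": [{
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": storage}},
                },
            }],
        },
    }


# Hypothetical values; adjust for the database actually being deployed.
manifest = single_replica_statefulset(
    "postgres", "postgres:16", "/var/lib/postgresql/data"
)
print(json.dumps(manifest, indent=2))
```

If the script were saved as manifest.py (a hypothetical file name), its output could be piped to kubectl apply -f - to create the StatefulSet, and the PVC would be created alongside the Pod without being declared separately.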