Tags: kubernetes, persistent-volumes, kubernetes-statefulset, kubernetes-deployment

PVC in deployment and stateful set


I am in the early stages of learning Kubernetes. I am not able to understand how persistent volumes work with a ReplicaSet with HPA and with a StatefulSet with HPA.

Assume we deployed a pod with 1 replica and it has a PVC, and the pod is reading and writing data on the persistent volume. Now we increase the replica count to 2. How does the second pod get a persistent volume? Does it get a new persistent volume, or does it use the same persistent volume as the first pod? Later we increase the replica count to 6, with a minimum of 2 and a maximum of 6 replicas, so pods can be created or deleted according to load. When the total pod count reaches 6, how do the new pods get persistent volumes? Does each new pod get a dedicated new persistent volume? Later, because the load drops, the number of pods is reduced to the minimum of 2. What happens to the persistent volumes of the deleted pods? And when the pod count later increases to 6 again, how do the new pods get their persistent volumes?

I have a similar question about StatefulSets. In a StatefulSet, pods have dedicated persistent volumes and DNS names. Assume we created a StatefulSet with 2 pods, each with its own dedicated persistent volume. Later we configure a minimum of 2 and a maximum of 6 replicas. If the pod count increases to 6 because of load, how do the new pods get persistent storage? Are the new pods allocated new persistent storage? And when the load drops and the number of pods is reduced to 2, what happens to the persistent storage of the removed pods? If the number of pods then increases to 6 again, how do the new pods get persistent storage?

Please help me in understanding this.

Thanks, Suresh

I am in the early stages of learning k8s. I am not able to understand PVs in Deployments and StatefulSets when they have more than one replica.


Solution

  • If you have a Deployment, every replica of the Deployment is identical, aside from its name. In particular, every replica will share the same PersistentVolumeClaim and the same underlying PersistentVolume.

    PersistentVolumes most often use the ReadWriteOnce access mode, which means they can only be mounted from a single node; if all of the replicas don't fit on the same node, this can prevent scaling up. There are also general problems that can happen when multiple processes try to write the same files, and you can see unexpected data corruption in this scenario. Some tools like databases use various forms of locking to prevent multiple processes from accessing the same underlying files, and in this case additional replicas of a Deployment might just fail to start up entirely.
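
    As a minimal sketch of the shared-claim situation described above, the manifests below define one PersistentVolumeClaim and a Deployment whose Pod template references it by name. All names here (`app-data`, `my-app`, the image, the mount path, the storage size) are hypothetical placeholders, not taken from the question. Because the claim name is fixed in the template, scaling `replicas` up does not create any new claims; every replica tries to mount the same volume.

    ```yaml
    # Hypothetical example: one PVC referenced by every replica of a Deployment.
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: app-data              # assumed name
    spec:
      accessModes:
        - ReadWriteOnce           # mountable read-write from a single node
      resources:
        requests:
          storage: 1Gi            # assumed size
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-app                # assumed name
    spec:
      replicas: 2                 # both replicas use the same claim below
      selector:
        matchLabels:
          app: my-app
      template:
        metadata:
          labels:
            app: my-app
        spec:
          containers:
            - name: app
              image: example.com/my-app:1.0   # placeholder image
              volumeMounts:
                - name: data
                  mountPath: /data            # assumed path
          volumes:
            - name: data
              persistentVolumeClaim:
                claimName: app-data   # fixed name, so no per-replica volume
    ```

    With a ReadWriteOnce volume, a replica scheduled onto a different node than the first one would typically be unable to mount the volume and would not start.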

    Conversely, each replica of a StatefulSet gets its own PersistentVolumeClaim, assuming you use the volumeClaimTemplates: field to declare the PVC. If anything causes the StatefulSet to scale up, the new Pod will get a new empty PersistentVolume. If it scales down, the PVC and its PV are preserved by default, and if it scales up again, the previous PV is reused by the Pod with the same ordinal.
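
    A sketch of the StatefulSet case, again with hypothetical names (`web`, `data`, the image, the sizes, and the 70% CPU target are all assumptions). The `volumeClaimTemplates:` entry makes the controller create one claim per Pod, named `<template name>-<StatefulSet name>-<ordinal>`, so here `data-web-0`, `data-web-1`, and so on. The HorizontalPodAutoscaler is included only to mirror the 2-to-6 replica range from the question; it assumes a metrics source is available in the cluster and that a headless Service named `web` exists.

    ```yaml
    # Hypothetical example: per-replica PVCs via volumeClaimTemplates.
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: web                   # assumed name
    spec:
      serviceName: web            # assumes a headless Service named "web"
      replicas: 2
      selector:
        matchLabels:
          app: web
      template:
        metadata:
          labels:
            app: web
        spec:
          containers:
            - name: app
              image: example.com/my-app:1.0   # placeholder image
              resources:
                requests:
                  cpu: 100m       # CPU request, needed for utilization-based scaling
              volumeMounts:
                - name: data
                  mountPath: /data            # assumed path
      volumeClaimTemplates:
        - metadata:
            name: data            # yields PVCs data-web-0, data-web-1, ...
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 1Gi      # assumed size
    ---
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: web
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: StatefulSet
        name: web
      minReplicas: 2
      maxReplicas: 6
      metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 70   # assumed threshold
    ```

    In this sketch, scaling from 2 to 6 creates `data-web-2` through `data-web-5`. Scaling back to 2 removes Pods `web-5` down to `web-2` but, with the default retention behavior, leaves their claims in place, so a later scale-up reattaches `web-2` to `data-web-2` with whatever data it held before.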

    As general rules for stateful applications, I might suggest:

    1. Don't attach PersistentVolumes to Deployments.
    2. If your application runs in a StatefulSet, it needs to be able to start up from a new empty data directory, and to coordinate data I/O across replicas.

    For many applications, a best practice will be to store all of the data in external data stores, possibly a separate database container in its own StatefulSet. The application itself shouldn't need persistent storage, and you can run it in a Deployment without a PersistentVolumeClaim.