kubernetespersistent-storage

Is Kubernetes local/csi PV content synced into a new node?


According to the documentation:

A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned ... It is a resource in the cluster just like a node is a cluster resource...

So I was reading about all currently available plugins for PVs and I understand that for 3rd-party / out-of-cluster storage this doesn't matter (e.g. storing data in EBS, Azure or GCE disks) because there are no or very little implications when adding or removing nodes from a cluster. However, there are different ones such as (ignoring hostPath as that works only for single-node clusters):

which (at least from what I've read in the docs) don't require 3rd-party vendors/software.

But also:

... local volumes are subject to the availability of the underlying node and are not suitable for all applications. If a node becomes unhealthy, then the local volume becomes inaccessible by the pod. The pod using this volume is unable to run. Applications using local volumes must be able to tolerate this reduced availability, as well as potential data loss, depending on the durability characteristics of the underlying disk.

The local PersistentVolume requires manual cleanup and deletion by the user if the external static provisioner is not used to manage the volume lifecycle.

Use-case

Let's say I have a single-node cluster with a single local PV and I want to add a new node to the cluster, so I have 2-node cluster (small numbers for simplicity).

Will the data from an already existing local PV be 1:1 replicated into the new node as in having one PV with 2 nodes of redundancy or is it strictly bound to the existing node only?

If the already existing PV can't be adjusted from 1 to 2 nodes, can a new PV (created from scratch) be created so it's 1:1 replicated between 2+ nodes on the cluster?

Alternatively if not, what would be the correct approach without using a 3rd-party out-of-cluster solution? Will using csi cause any change to the overall approach or is it the same with redundancy, just different "engine" under the hood?


Solution

  • Can a new PV be created so it's 1:1 replicated between 2+ nodes on the cluster?

    None of the standard volume types are replicated at all. If you can use a volume type that supports ReadWriteMany access (most readily NFS) then multiple pods can use it simultaneously, but you would have to run the matching NFS server.

    Of the volume types you reference:

    In the specific case of a local test cluster (say, using kind) using a local volume will constrain pods to run on the node that has the data, which is more robust than using a hostPath volume. It won't replicate the data, though, so if the node with the data is deleted, the data goes away with it.