Tags: mongodb, kubernetes, kubernetes-operator

MongoDB Community Kubernetes Operator and Custom Persistent Volumes


I'm trying to deploy a MongoDB replica set by using the MongoDB Community Kubernetes Operator in Minikube.
I followed the instructions on the official GitHub repository, so:

By default, the operator creates three pods, each of them automatically linked to a new persistent volume claim bound to a new persistent volume, also created by the operator (so far so good).

However, I would like the data to be saved in a specific volume, mounted at a specific host path. So I would need to create three persistent volumes, each mounted at a specific host path, and then configure the replica set so that each pod connects to its respective persistent volume (perhaps using the matchLabels selector). So I created three volumes by applying the following file:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: mongodb-pv-00
  namespace: $NAMESPACE
  labels: 
    type: local
    service: mongo
spec:
  storageClassName: manual
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteOnce
  hostPath: 
    path: "/mnt/mongodata/00"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mongodb-pv-01
  namespace: $NAMESPACE
  labels: 
    type: local
    service: mongo
spec:
  storageClassName: manual
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteOnce
  hostPath: 
    path: "/mnt/mongodata/01"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: mongodb-pv-02
  namespace: $NAMESPACE
  labels: 
    type: local
    service: mongo
spec:
  storageClassName: manual
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteOnce
  hostPath: 
    path: "/mnt/mongodata/02"

and then I set up the replica set configuration file in the following way, but it still fails to connect the pods to the volumes:

apiVersion: mongodbcommunity.mongodb.com/v1
kind: MongoDBCommunity
metadata:
  name: mongo-rs
  namespace: $NAMESPACE
spec:
  members: 3
  type: ReplicaSet
  version: "4.4.0"
  persistent: true
  podSpec:
    persistence:
      single: 
        labelSelector: 
          matchLabels:
            type: local
            service: mongo
        storage: 5Gi
        storageClass: manual
  statefulSet:
    spec:
      volumeClaimTemplates:
        - metadata:
            name: data-volume
          spec:
            accessModes: [ "ReadWriteOnce", "ReadWriteMany" ]
            resources:
              requests:
                storage: 5Gi
            selector:
              matchLabels:
                type: local
                service: mongo
            storageClassName: manual
  security:
    authentication:
      modes: ["SCRAM"]
  users:
    - ...
  additionalMongodConfig:
    storage.wiredTiger.engineConfig.journalCompressor: zlib

I can't find any documentation online, except for mongodb.com_v1_custom_volume_cr.yaml. Has anyone faced this problem before? How could I make it work?


Solution

  • I think you could be interested in using local volumes. It works like this:

    First, you create a storage class for the local volumes. Something like the following:

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: local-storage
    provisioner: kubernetes.io/no-provisioner
    volumeBindingMode: WaitForFirstConsumer
    

    Since it uses no-provisioner, it will be usable only if you manually create local PVs. WaitForFirstConsumer, instead, prevents binding a PV to the PVC of a Pod that cannot be scheduled on the host where the PV is available.

    Second, you create the local PVs. Similar to the ones you created in your example, something like this:

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: example-pv
    spec:
      capacity:
        storage: 5Gi
      volumeMode: Filesystem
      accessModes:
      - ReadWriteOnce
      persistentVolumeReclaimPolicy: Retain
      storageClassName: local-storage
      local:
        path: /path/on/the/host
      nodeAffinity:
        required:
          nodeSelectorTerms:
          - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
              - the-node-hostname-on-which-the-storage-is-located
    

    Notice the definition: it specifies the path on the host and the capacity, and then declares (with nodeAffinity) on which node of the cluster the PV can be used. It also links the PV to the storage class we created earlier, so that if someone (a claim template) requests storage with that class, it will now find this PV.

    You can create 3 PVs on 3 different nodes, or 3 PVs on the same node at different paths; organize things as you prefer (see the sketch below for the single-node Minikube case).
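
    For a single-node Minikube cluster, a minimal sketch of the three PVs could look like the following. It assumes the node's hostname is minikube (check it with kubectl get nodes) and that the directories under /mnt/mongodata already exist on the node:

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: mongodb-pv-00
      labels:
        type: local
        service: mongo
    spec:
      storageClassName: local-storage
      capacity:
        storage: 5Gi
      volumeMode: Filesystem
      accessModes:
      - ReadWriteOnce
      persistentVolumeReclaimPolicy: Retain
      local:
        path: /mnt/mongodata/00
      nodeAffinity:
        required:
          nodeSelectorTerms:
          - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
              - minikube
    # mongodb-pv-01 and mongodb-pv-02 follow the same pattern,
    # pointing at /mnt/mongodata/01 and /mnt/mongodata/02 respectively.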

    Third, you can now use the local-storage class in the claim template. The claim template could be something like this:

    volumeClaimTemplates:
      - metadata:
          name: the-name-of-the-pvc
        spec:
          accessModes: [ "ReadWriteOnce" ]
          storageClassName: "local-storage"
          resources:
            requests:
              storage: 5Gi
    

    And each Pod of the StatefulSet will try to be scheduled on a node with a local-storage PV available.
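
    To tie this back to your MongoDBCommunity resource, here is a rough sketch of how the claim template could be plugged in through the statefulSet override. It reuses the data-volume claim name from your example, and the security, users and additionalMongodConfig sections are omitted here and stay as in your original resource:

    apiVersion: mongodbcommunity.mongodb.com/v1
    kind: MongoDBCommunity
    metadata:
      name: mongo-rs
      namespace: $NAMESPACE
    spec:
      members: 3
      type: ReplicaSet
      version: "4.4.0"
      # security, users and additionalMongodConfig omitted: keep them as in your resource
      statefulSet:
        spec:
          volumeClaimTemplates:
            - metadata:
                name: data-volume                  # same claim name as in your example
              spec:
                accessModes: [ "ReadWriteOnce" ]   # a single access mode; local volumes are node-bound
                storageClassName: local-storage
                resources:
                  requests:
                    storage: 5Gi

    If you also keep the type/service labels on the PVs, you can additionally add a selector with those matchLabels under the claim spec, as in your original attempt, but a storage class used only by these PVs is usually enough for the claims to match them.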


    Remember that with local storage, or in general with volumes that rely on host paths, you may want to spread the Pods of your app across different nodes, so that the app can survive the failure of a single node. One way to do that is a pod anti-affinity rule, as in the sketch below.
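
    A hedged sketch of such a rule (a soft preference, so it will not block scheduling on a single-node Minikube), placed under the same statefulSet override; the app label in the selector is an assumption and must match whatever labels the operator actually sets on the Pods (check with kubectl get pods --show-labels):

    statefulSet:
      spec:
        template:
          spec:
            affinity:
              podAntiAffinity:
                preferredDuringSchedulingIgnoredDuringExecution:
                  - weight: 100
                    podAffinityTerm:
                      labelSelector:
                        matchLabels:
                          app: mongo-rs-svc   # assumption: replace with the actual Pod label
                      topologyKey: kubernetes.io/hostname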


    If you want to decide which Pod binds to which volume, the easiest way is to create one PV at a time and wait for the Pod to bind to it before creating the next one. It's not optimal, but it's the simplest way.