I tried to install rook-ceph on Kubernetes following this guide:
https://rook.io/docs/rook/v1.3/ceph-quickstart.html
git clone --single-branch --branch release-1.3 https://github.com/rook/rook.git
cd rook/cluster/examples/kubernetes/ceph
kubectl create -f common.yaml
kubectl create -f operator.yaml
kubectl create -f cluster.yaml
When I check all the pods
$ kubectl -n rook-ceph get pod
NAME READY STATUS RESTARTS AGE
csi-cephfsplugin-9c2z9 3/3 Running 0 23m
csi-cephfsplugin-provisioner-7678bcfc46-s67hq 5/5 Running 0 23m
csi-cephfsplugin-provisioner-7678bcfc46-sfljd 5/5 Running 0 23m
csi-cephfsplugin-smmlf 3/3 Running 0 23m
csi-rbdplugin-provisioner-fbd45b7c8-dnwsq 6/6 Running 0 23m
csi-rbdplugin-provisioner-fbd45b7c8-rp85z 6/6 Running 0 23m
csi-rbdplugin-s67lw 3/3 Running 0 23m
csi-rbdplugin-zq4k5 3/3 Running 0 23m
rook-ceph-mon-a-canary-954dc5cd9-5q8tk 1/1 Running 0 2m9s
rook-ceph-mon-b-canary-b9d6f5594-mcqwc 1/1 Running 0 2m9s
rook-ceph-mon-c-canary-78b48dbfb7-z2t7d 0/1 Pending 0 2m8s
rook-ceph-operator-757d6db48d-x27lm 1/1 Running 0 25m
rook-ceph-tools-75f575489-znbbz 1/1 Running 0 7m45s
rook-discover-gq489 1/1 Running 0 24m
rook-discover-p9zlg 1/1 Running 0 24m
$ kubectl -n rook-ceph get pod -l app=rook-ceph-osd-prepare
No resources found in rook-ceph namespace.
I then tried a few other operations: removing the master taint and restarting the operator pod
$ kubectl taint nodes $(hostname) node-role.kubernetes.io/master:NoSchedule-
$ kubectl -n rook-ceph-system delete pods rook-ceph-operator-757d6db48d-x27lm
Create file system
$ kubectl create -f filesystem.yaml
Check again
$ kubectl get pods -n rook-ceph -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
csi-cephfsplugin-9c2z9 3/3 Running 0 135m 192.168.0.53 kube3 <none> <none>
csi-cephfsplugin-provisioner-7678bcfc46-s67hq 5/5 Running 0 135m 10.1.2.6 kube3 <none> <none>
csi-cephfsplugin-provisioner-7678bcfc46-sfljd 5/5 Running 0 135m 10.1.2.5 kube3 <none> <none>
csi-cephfsplugin-smmlf 3/3 Running 0 135m 192.168.0.52 kube2 <none> <none>
csi-rbdplugin-provisioner-fbd45b7c8-dnwsq 6/6 Running 0 135m 10.1.1.6 kube2 <none> <none>
csi-rbdplugin-provisioner-fbd45b7c8-rp85z 6/6 Running 0 135m 10.1.1.5 kube2 <none> <none>
csi-rbdplugin-s67lw 3/3 Running 0 135m 192.168.0.52 kube2 <none> <none>
csi-rbdplugin-zq4k5 3/3 Running 0 135m 192.168.0.53 kube3 <none> <none>
rook-ceph-crashcollector-kube2-6d95bb9c-r5w7p 0/1 Init:0/2 0 110m <none> kube2 <none> <none>
rook-ceph-crashcollector-kube3-644c849bdb-9hcvg 0/1 Init:0/2 0 110m <none> kube3 <none> <none>
rook-ceph-mon-a-canary-954dc5cd9-6ccbh 1/1 Running 0 75s 10.1.2.130 kube3 <none> <none>
rook-ceph-mon-b-canary-b9d6f5594-k85w5 1/1 Running 0 74s 10.1.1.74 kube2 <none> <none>
rook-ceph-mon-c-canary-78b48dbfb7-kfzzx 0/1 Pending 0 73s <none> <none> <none> <none>
rook-ceph-operator-757d6db48d-nlh84 1/1 Running 0 110m 10.1.2.28 kube3 <none> <none>
rook-ceph-tools-75f575489-znbbz 1/1 Running 0 119m 10.1.1.14 kube2 <none> <none>
rook-discover-gq489 1/1 Running 0 135m 10.1.1.3 kube2 <none> <none>
rook-discover-p9zlg 1/1 Running 0 135m 10.1.2.4 kube3 <none> <none>
I can't see any rook-ceph-osd- pods, and the rook-ceph-mon-c-canary-78b48dbfb7-kfzzx pod is always Pending.
If I install the toolbox as described in
https://rook.io/docs/rook/v1.3/ceph-toolbox.html
$ kubectl create -f toolbox.yaml
$ kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash
Inside the container, check the ceph status
[root@rook-ceph-tools-75f575489-znbbz /]# ceph -s
unable to get monitor info from DNS SRV with service name: ceph-mon
[errno 2] error connecting to the cluster
It's running on Ubuntu 16.04.6.
Deploy again
$ kubectl -n rook-ceph get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
csi-cephfsplugin-4tww8 3/3 Running 0 3m38s 192.168.0.52 kube2 <none> <none>
csi-cephfsplugin-dbbfb 3/3 Running 0 3m38s 192.168.0.53 kube3 <none> <none>
csi-cephfsplugin-provisioner-7678bcfc46-8kt96 5/5 Running 0 3m37s 10.1.2.6 kube3 <none> <none>
csi-cephfsplugin-provisioner-7678bcfc46-kq6vv 5/5 Running 0 3m38s 10.1.1.6 kube2 <none> <none>
csi-rbdplugin-4qrqn 3/3 Running 0 3m39s 192.168.0.53 kube3 <none> <none>
csi-rbdplugin-dqx9z 3/3 Running 0 3m39s 192.168.0.52 kube2 <none> <none>
csi-rbdplugin-provisioner-fbd45b7c8-7f57t 6/6 Running 0 3m39s 10.1.2.5 kube3 <none> <none>
csi-rbdplugin-provisioner-fbd45b7c8-9zwhb 6/6 Running 0 3m39s 10.1.1.5 kube2 <none> <none>
rook-ceph-mon-a-canary-954dc5cd9-rgqpg 1/1 Running 0 2m40s 10.1.1.7 kube2 <none> <none>
rook-ceph-mon-b-canary-b9d6f5594-n2pwc 1/1 Running 0 2m35s 10.1.2.8 kube3 <none> <none>
rook-ceph-mon-c-canary-78b48dbfb7-fv46f 0/1 Pending 0 2m30s <none> <none> <none> <none>
rook-ceph-operator-757d6db48d-2m25g 1/1 Running 0 6m27s 10.1.2.3 kube3 <none> <none>
rook-discover-lpsht 1/1 Running 0 5m15s 10.1.1.3 kube2 <none> <none>
rook-discover-v4l77 1/1 Running 0 5m15s 10.1.2.4 kube3 <none> <none>
Describe pending pod
$ kubectl describe pod rook-ceph-mon-c-canary-78b48dbfb7-fv46f -n rook-ceph
Name: rook-ceph-mon-c-canary-78b48dbfb7-fv46f
Namespace: rook-ceph
Priority: 0
Node: <none>
Labels: app=rook-ceph-mon
ceph_daemon_id=c
mon=c
mon_canary=true
mon_cluster=rook-ceph
pod-template-hash=78b48dbfb7
rook_cluster=rook-ceph
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/rook-ceph-mon-c-canary-78b48dbfb7
Containers:
mon:
Image: rook/ceph:v1.3.4
Port: 6789/TCP
Host Port: 0/TCP
Command:
/tini
Args:
--
sleep
3600
Environment:
CONTAINER_IMAGE: ceph/ceph:v14.2.9
POD_NAME: rook-ceph-mon-c-canary-78b48dbfb7-fv46f (v1:metadata.name)
POD_NAMESPACE: rook-ceph (v1:metadata.namespace)
NODE_NAME: (v1:spec.nodeName)
POD_MEMORY_LIMIT: node allocatable (limits.memory)
POD_MEMORY_REQUEST: 0 (requests.memory)
POD_CPU_LIMIT: node allocatable (limits.cpu)
POD_CPU_REQUEST: 0 (requests.cpu)
ROOK_CEPH_MON_HOST: <set to the key 'mon_host' in secret 'rook-ceph-config'> Optional: false
ROOK_CEPH_MON_INITIAL_MEMBERS: <set to the key 'mon_initial_members' in secret 'rook-ceph-config'> Optional: false
ROOK_POD_IP: (v1:status.podIP)
Mounts:
/etc/ceph from rook-config-override (ro)
/etc/ceph/keyring-store/ from rook-ceph-mons-keyring (ro)
/var/lib/ceph/crash from rook-ceph-crash (rw)
/var/lib/ceph/mon/ceph-c from ceph-daemon-data (rw)
/var/log/ceph from rook-ceph-log (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-65xtn (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
rook-config-override:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: rook-config-override
Optional: false
rook-ceph-mons-keyring:
Type: Secret (a volume populated by a Secret)
SecretName: rook-ceph-mons-keyring
Optional: false
rook-ceph-log:
Type: HostPath (bare host directory volume)
Path: /var/lib/rook/rook-ceph/log
HostPathType:
rook-ceph-crash:
Type: HostPath (bare host directory volume)
Path: /var/lib/rook/rook-ceph/crash
HostPathType:
ceph-daemon-data:
Type: HostPath (bare host directory volume)
Path: /var/lib/rook/mon-c/data
HostPathType:
default-token-65xtn:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-65xtn
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 22s (x3 over 84s) default-scheduler 0/3 nodes are available: 1 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 2 node(s) didn't match pod affinity/anti-affinity, 2 node(s) didn't satisfy existing pods anti-affinity rules.
Test mount
Create an nginx.yaml file:
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.7.9
    ports:
    - containerPort: 80
    volumeMounts:
    - name: www
      mountPath: /usr/share/nginx/html
  volumes:
  - name: www
    flexVolume:
      driver: ceph.rook.io/rook
      fsType: ceph
      options:
        fsName: myfs
        clusterNamespace: rook-ceph
Deploy it and describe the pod:
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 9m28s default-scheduler Successfully assigned default/nginx to kube2
Warning FailedMount 9m28s kubelet, kube2 Unable to attach or mount volumes: unmounted volumes=[www default-token-fnb28], unattached volumes=[www default-token-fnb28]: failed to get Plugin from volumeSpec for volume "www" err=no volume plugin matched
Warning FailedMount 6m14s (x2 over 6m38s) kubelet, kube2 Unable to attach or mount volumes: unmounted volumes=[www], unattached volumes=[default-token-fnb28 www]: failed to get Plugin from volumeSpec for volume "www" err=no volume plugin matched
Warning FailedMount 4m6s (x23 over 9m13s) kubelet, kube2 Unable to attach or mount volumes: unmounted volumes=[www], unattached volumes=[www default-token-fnb28]: failed to get Plugin from volumeSpec for volume "www" err=no volume plugin matched
The rook-ceph-mon-x pods have the following anti-affinity:
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: rook-ceph-mon
        topologyKey: kubernetes.io/hostname
This anti-affinity rule doesn't allow two rook-ceph-mon pods to run on the same node. You seem to have 3 nodes, 1 master and 2 workers, so only 2 mon pods get scheduled: one on kube2 and one on kube3. kube1 is the master node and carries a NoSchedule taint that the mon pods don't tolerate, so rook-ceph-mon-c cannot be scheduled there and stays Pending.
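To confirm this on your own cluster, a rough check is to count the nodes that have no NoSchedule or NoExecute taint. The function name `schedulable_nodes` is hypothetical, and the sketch assumes the pods tolerate no such taints, which matches the mon canary pods in the `describe` output above. If the count is lower than the number of mons requested, at least one mon has nowhere to go.

```shell
# Hypothetical helper: count nodes without a NoSchedule/NoExecute taint.
# Assumes the pods in question tolerate neither effect (true for the mon
# canary pods shown in the question). PreferNoSchedule is a soft taint,
# so nodes carrying only that are still counted as schedulable.
schedulable_nodes() {
  kubectl get nodes --no-headers \
    -o custom-columns='NAME:.metadata.name,EFFECTS:.spec.taints[*].effect' \
  | awk '$2 ~ /(^|,)(NoSchedule|NoExecute)(,|$)/ { next }
         { n++ }
         END { print n + 0 }'
}
```

Run `schedulable_nodes` from any machine with `kubectl` access. With the required hostname anti-affinity, the result is also the maximum number of mon pods that can be placed.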
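To solve it, you can remove the taint from the master node so that the third mon can be scheduled there. The taint key is the one reported in the FailedScheduling event:
kubectl taint nodes kube1 node-role.kubernetes.io/master:NoSchedule-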
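After changing the taint, it may help to check whether any mon pod is still unscheduled. The following is a minimal sketch: the function name `pending_mons` is hypothetical, while the namespace `rook-ceph` and the label `app=rook-ceph-mon` are taken from the `describe` output in the question.

```shell
# Hypothetical helper: print the names of mon pods that are still Pending.
# Empty output means every mon pod has been scheduled.
pending_mons() {
  kubectl -n rook-ceph get pod -l app=rook-ceph-mon \
    --field-selector=status.phase=Pending \
    --no-headers -o custom-columns='NAME:.metadata.name'
}
```

If `pending_mons` still prints a pod name after a minute or two, `kubectl -n rook-ceph describe pod <name>` shows the current scheduling events for that pod.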