I have installed the latest version of OKD 4 on a 5-node cluster with 3 control-plane nodes and 2 compute nodes.
When I run oc get co, I see the following error message for the machine-config operator:
NAME                                       VERSION                          AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.10.0-0.okd-2022-07-09-073606   True        False         False      7h14m
baremetal                                  4.10.0-0.okd-2022-07-09-073606   True        False         False      15h
cloud-controller-manager                   4.10.0-0.okd-2022-07-09-073606   True        False         False      15h
cloud-credential                           4.10.0-0.okd-2022-07-09-073606   True        False         False      15h
cluster-autoscaler                         4.10.0-0.okd-2022-07-09-073606   True        False         False      15h
config-operator                            4.10.0-0.okd-2022-07-09-073606   True        False         False      15h
console                                    4.10.0-0.okd-2022-07-09-073606   True        False         False      7h14m
csi-snapshot-controller                    4.10.0-0.okd-2022-07-09-073606   True        False         False      13h
dns                                        4.10.0-0.okd-2022-07-09-073606   True        False         False      13h
etcd                                       4.10.0-0.okd-2022-07-09-073606   True        False         False      13h
image-registry                             4.10.0-0.okd-2022-07-09-073606   True        False         False      3h1m
ingress                                    4.10.0-0.okd-2022-07-09-073606   True        False         False      8h
insights                                   4.10.0-0.okd-2022-07-09-073606   True        False         False      14h
kube-apiserver                             4.10.0-0.okd-2022-07-09-073606   True        False         False      13h
kube-controller-manager                    4.10.0-0.okd-2022-07-09-073606   True        False         False      13h
kube-scheduler                             4.10.0-0.okd-2022-07-09-073606   True        False         False      14h
kube-storage-version-migrator              4.10.0-0.okd-2022-07-09-073606   True        False         False      13h
machine-api                                4.10.0-0.okd-2022-07-09-073606   True        False         False      14h
machine-approver                           4.10.0-0.okd-2022-07-09-073606   True        False         False      15h
machine-config                                                              True        True          True       13h     Unable to apply 4.10.0-0.okd-2022-07-09-073606: timed out waiting for the condition during syncRequiredMachineConfigPools: error pool master is not ready, retrying. Status: (pool degraded: true total: 3, ready 0, updated: 0, unavailable: 3)
marketplace                                4.10.0-0.okd-2022-07-09-073606   True        False         False      15h
monitoring                                 4.10.0-0.okd-2022-07-09-073606   True        False         False      8h
network                                    4.10.0-0.okd-2022-07-09-073606   True        False         False      13h
node-tuning                                4.10.0-0.okd-2022-07-09-073606   True        False         False      8h
openshift-apiserver                        4.10.0-0.okd-2022-07-09-073606   True        False         False      13h
openshift-controller-manager               4.10.0-0.okd-2022-07-09-073606   True        False         False      33m
openshift-samples                          4.10.0-0.okd-2022-07-09-073606   True        False         False      13h
operator-lifecycle-manager                 4.10.0-0.okd-2022-07-09-073606   True        False         False      14h
operator-lifecycle-manager-catalog         4.10.0-0.okd-2022-07-09-073606   True        False         False      14h
operator-lifecycle-manager-packageserver   4.10.0-0.okd-2022-07-09-073606   True        False         False      13h
service-ca                                 4.10.0-0.okd-2022-07-09-073606   True        False         False      15h
storage                                    4.10.0-0.okd-2022-07-09-073606   True        False         False      15h
When I run oc get mcp, I get:
oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master                                                      False     True       True       3              0                   0                     3                      15h
worker   rendered-worker-04b4cdd431c21b96c1f98ca595ded448   True      False      False      2              2                   2                     0                      15h
When I describe the degraded machine config pool, I see the following:
oc describe mcp master
Name: master
Namespace:
Labels: machineconfiguration.openshift.io/mco-built-in=
operator.machineconfiguration.openshift.io/required-for-upgrade=
pools.operator.machineconfiguration.openshift.io/master=
Annotations: <none>
API Version: machineconfiguration.openshift.io/v1
Kind: MachineConfigPool
Metadata:
Creation Timestamp: 2022-07-24T03:25:28Z
Generation: 2
Managed Fields:
API Version: machineconfiguration.openshift.io/v1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:labels:
.:
f:machineconfiguration.openshift.io/mco-built-in:
f:operator.machineconfiguration.openshift.io/required-for-upgrade:
f:pools.operator.machineconfiguration.openshift.io/master:
f:spec:
.:
f:configuration:
f:machineConfigSelector:
.:
f:matchLabels:
.:
f:machineconfiguration.openshift.io/role:
f:nodeSelector:
.:
f:matchLabels:
.:
f:node-role.kubernetes.io/master:
f:paused:
Manager: machine-config-operator
Operation: Update
Time: 2022-07-24T03:25:28Z
API Version: machineconfiguration.openshift.io/v1
Fields Type: FieldsV1
fieldsV1:
f:spec:
f:configuration:
f:name:
f:source:
Manager: machine-config-controller
Operation: Update
Time: 2022-07-24T05:05:35Z
API Version: machineconfiguration.openshift.io/v1
Fields Type: FieldsV1
fieldsV1:
f:status:
.:
f:conditions:
f:configuration:
f:degradedMachineCount:
f:machineCount:
f:observedGeneration:
f:readyMachineCount:
f:unavailableMachineCount:
f:updatedMachineCount:
Manager: machine-config-controller
Operation: Update
Subresource: status
Time: 2022-07-24T05:05:40Z
Resource Version: 41348
UID: 6eea1467-dfd1-4e25-a0a5-a303d21c4076
Spec:
Configuration:
Name: rendered-master-5ac7b1a497e20b76e47aaf715bc0dc6f
Source:
API Version: machineconfiguration.openshift.io/v1
Kind: MachineConfig
Name: 00-master
API Version: machineconfiguration.openshift.io/v1
Kind: MachineConfig
Name: 01-master-container-runtime
API Version: machineconfiguration.openshift.io/v1
Kind: MachineConfig
Name: 01-master-kubelet
API Version: machineconfiguration.openshift.io/v1
Kind: MachineConfig
Name: 99-master-generated-crio-seccomp-use-default
API Version: machineconfiguration.openshift.io/v1
Kind: MachineConfig
Name: 99-master-generated-registries
API Version: machineconfiguration.openshift.io/v1
Kind: MachineConfig
Name: 99-master-okd-extensions
API Version: machineconfiguration.openshift.io/v1
Kind: MachineConfig
Name: 99-master-ssh
API Version: machineconfiguration.openshift.io/v1
Kind: MachineConfig
Name: 99-okd-master-disable-mitigations
Machine Config Selector:
Match Labels:
machineconfiguration.openshift.io/role: master
Node Selector:
Match Labels:
node-role.kubernetes.io/master:
Paused: false
Status:
Conditions:
Last Transition Time: 2022-07-24T05:05:36Z
Message:
Reason:
Status: False
Type: RenderDegraded
Last Transition Time: 2022-07-24T05:05:40Z
Message:
Reason:
Status: False
Type: Updated
Last Transition Time: 2022-07-24T05:05:40Z
Message: All nodes are updating to rendered-master-5ac7b1a497e20b76e47aaf715bc0dc6f
Reason:
Status: True
Type: Updating
Last Transition Time: 2022-07-24T05:05:40Z
Message:
Reason:
Status: True
Type: Degraded
Last Transition Time: 2022-07-24T05:05:40Z
Message: Node okd4-control-plane-1 is reporting: "machineconfig.machineconfiguration.openshift.io \"rendered-master-d06288fa8a499313709afdb2c727de31\" not found", Node okd4-control-plane-2 is reporting: "machineconfig.machineconfiguration.openshift.io \"rendered-master-d06288fa8a499313709afdb2c727de31\" not found", Node okd4-control-plane-3 is reporting: "machineconfig.machineconfiguration.openshift.io \"rendered-master-d06288fa8a499313709afdb2c727de31\" not found"
Reason: 3 nodes are reporting degraded status on sync
Status: True
Type: NodeDegraded
Configuration:
Degraded Machine Count: 3
Machine Count: 3
Observed Generation: 2
Ready Machine Count: 0
Unavailable Machine Count: 3
Updated Machine Count: 0
Events: <none>
Any suggestions on how to solve this?
I fixed it by deleting the master MachineConfigPool, which triggered it to be recreated; after that, everything came up clean:
oc delete mcp master
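
For anyone hitting the same thing: the describe output above shows a mismatch. The pool's Spec targets rendered-master-5ac7b1a497e20b76e47aaf715bc0dc6f, while all three nodes report that rendered-master-d06288fa8a499313709afdb2c727de31 is not found. Below is a minimal sketch for spotting that kind of mismatch from pasted output. The helper name rendered_names is hypothetical, and the sample strings are copied from the output in this post rather than read from a live cluster.

```shell
#!/bin/sh
# Hypothetical helper: print each distinct rendered MachineConfig name
# (rendered-<pool>-<32 hex chars>) found in the text read from stdin.
rendered_names() {
    grep -oE 'rendered-[a-z]+-[0-9a-f]{32}' | sort -u
}

# Sample input: part of the NodeDegraded message quoted in this post.
message='Node okd4-control-plane-1 is reporting: "machineconfig.machineconfiguration.openshift.io \"rendered-master-d06288fa8a499313709afdb2c727de31\" not found"'

# Config the nodes are asking for, according to the message.
wanted=$(printf '%s\n' "$message" | rendered_names)

# Config the pool targets, copied from Spec > Configuration > Name.
pool='rendered-master-5ac7b1a497e20b76e47aaf715bc0dc6f'

if [ "$wanted" != "$pool" ]; then
    echo "nodes reference $wanted but the pool targets $pool"
fi
```

On a live cluster, the message could probably be fed in with something like oc get mcp master -o jsonpath='{.status.conditions[?(@.type=="NodeDegraded")].message}', and oc get mc lists the rendered configs that actually exist; I have not verified either against this exact cluster, so treat them as a starting point.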