I created a single-node Kubernetes cluster, and the hostname of the master node has since been changed, which is causing the node to be in the NotReady state. The etcd entry for the cluster shows the same, and this is what I get when checking the kubelet status with systemctl status kubelet:
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Mon 2024-08-12 16:26:26 IST; 11min ago
Docs: https://kubernetes.io/docs/home/
Main PID: 500476 (kubelet)
Tasks: 51 (limit: 154427)
Memory: 42.2M
CPU: 25.482s
CGroup: /system.slice/kubelet.service
└─500476 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --container-runtime-endpoint=unix:///var/run/containerd/containerd.sock --pod-in>
Aug 12 16:38:06 AIRAITD0258 kubelet[500476]: I0812 16:38:06.007224 500476 kubelet_node_status.go:70] "Attempting to register node" node="airaitd0258"
Aug 12 16:38:06 AIRAITD0258 kubelet[500476]: E0812 16:38:06.010747 500476 kubelet_node_status.go:92] "Unable to register node with API server" err="nodes \"airaitd0258\" is forbidden: node \"master-node\" is not allowed to modify node \"airaitd0258\"" node="ai>
Aug 12 16:38:06 AIRAITD0258 kubelet[500476]: E0812 16:38:06.836547 500476 eviction_manager.go:258] "Eviction manager: failed to get summary stats" err="failed to get node info: node \"airaitd0258\" not found"
Aug 12 16:38:12 AIRAITD0258 kubelet[500476]: E0812 16:38:12.572630 500476 controller.go:146] "Failed to ensure lease exists, will retry" err="leases.coordination.k8s.io \"airaitd0258\" is forbidden: User \"system:node:master-node\" cannot get resource \"leases>
Is there any way to recover this node?
The only solution I have come across is this one, but it describes the process for a worker node rather than a master node: How to change name of a kubernetes node
Resetting and then re-initializing the cluster is the easiest and safest method. (Similarly, kubeadm reset is needed if the hostname of a worker node changes.) You can reset with the command sudo kubeadm reset and then re-initialize the cluster with sudo kubeadm init, as in the sketch below.
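A minimal sketch of that path, assuming a single-node kubeadm cluster with containerd; the pod CIDR and the taint key below are examples and depend on your CNI plugin and Kubernetes version:

# tear down the existing control plane, then re-initialize under the new hostname
sudo kubeadm reset -f
sudo kubeadm init --pod-network-cidr=10.244.0.0/16   # CIDR is an example; match your CNI
# point kubectl at the new admin kubeconfig
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# on a single-node cluster, allow workloads on the control plane again
# (the taint key is node-role.kubernetes.io/master- on older releases)
kubectl taint nodes --all node-role.kubernetes.io/control-plane-

After this you will also need to reinstall your CNI plugin and redeploy any workloads, since the reset wipes the previous cluster state.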
It might be possible to do this by stopping the cluster, manually editing etcd, renewing the certificates, and making all the other related updates. Please refer to Operating etcd clusters for Kubernetes.
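A rough, illustrative sketch of what that manual route involves, assuming a stock kubeadm layout for the etcd and PKI paths; the specific phase commands here are my own illustration, not a tested procedure, so back up /etc/kubernetes and the etcd data directory first:

# inspect the node objects stored in etcd (flags assume the default kubeadm certificates)
sudo ETCDCTL_API=3 etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  get /registry/minions --prefix --keys-only
# regenerate the kubelet kubeconfig and API server certificate so they carry the
# new node name; kubeadm skips files that already exist, so move the old ones aside
sudo mv /etc/kubernetes/kubelet.conf /etc/kubernetes/kubelet.conf.bak
sudo mv /etc/kubernetes/pki/apiserver.crt /etc/kubernetes/pki/apiserver.crt.bak
sudo mv /etc/kubernetes/pki/apiserver.key /etc/kubernetes/pki/apiserver.key.bak
sudo kubeadm init phase kubeconfig kubelet
sudo kubeadm init phase certs apiserver
sudo systemctl restart kubelet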
As an alternative there are backup/migration solutions: you can use the Velero tool to safely back up and restore, perform disaster recovery, and migrate Kubernetes cluster resources and persistent volumes.
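A short sketch of how that could look with the Velero CLI, assuming Velero is already installed with an object-storage backend; the backup name pre-rename is just an example:

velero backup create pre-rename
velero backup get                              # confirm the backup completed
# ...rebuild the cluster under the new hostname (kubeadm reset / kubeadm init)...
velero restore create --from-backup pre-rename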
I would also suggest updating the cluster version to at least 1.19. Follow Upgrading kubeadm clusters for more details.
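A sketch of the control-plane upgrade path on a Debian/Ubuntu host; the target version below (1.19.16) is only an example, and kubeadm supports upgrading one minor version at a time, so the steps may need to be repeated per minor release:

# upgrade kubeadm itself first
sudo apt-mark unhold kubeadm
sudo apt-get update && sudo apt-get install -y kubeadm=1.19.16-00
sudo apt-mark hold kubeadm
# plan and apply the control-plane upgrade
sudo kubeadm upgrade plan                      # shows the versions you can upgrade to
sudo kubeadm upgrade apply v1.19.16
# then upgrade kubelet and kubectl and restart the kubelet
sudo apt-mark unhold kubelet kubectl
sudo apt-get install -y kubelet=1.19.16-00 kubectl=1.19.16-00
sudo apt-mark hold kubelet kubectl
sudo systemctl daemon-reload && sudo systemctl restart kubelet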