I have a problem with my Kubernetes cluster where my kube-scheduler pod is stuck in the 'CrashLoopBackOff' state and I am unable to rectify it. The logs complain of a missing service account token:
kubectl logs kube-scheduler-master -n kube-system
I1011 09:01:04.309289 1 serving.go:319] Generated self-signed cert in-memory
W1011 09:01:20.579733 1 authentication.go:387] failed to read in-cluster kubeconfig for delegated authentication: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory
W1011 09:01:20.579889 1 authentication.go:249] No authentication-kubeconfig provided in order to lookup client-ca-file in configmap/extension-apiserver-authentication in kube-system, so client certificate authentication won't work.
W1011 09:01:20.579917 1 authentication.go:252] No authentication-kubeconfig provided in order to lookup requestheader-client-ca-file in configmap/extension-apiserver-authentication in kube-system, so request-header client certificate authentication won't work.
W1011 09:01:20.579990 1 authorization.go:177] failed to read in-cluster kubeconfig for delegated authorization: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory
W1011 09:01:20.580040 1 authorization.go:146] No authorization-kubeconfig provided, so SubjectAccessReview of authorization tokens won't work.
invalid configuration: no configuration has been provided
Can anyone please explain what /var/run/secrets/kubernetes.io/serviceaccount/token is, where it is supposed to be stored (is the path on the host or within the container?), and how I go about regenerating it?
I'm running version 1.15.4 across all of my nodes, which were set up using kubeadm. I have stupidly upgraded the cluster since this error first started (I read that it could possibly be a bug in the version I was using); I was previously using version 1.14.*.
Any help would be greatly appreciated; everything runs on this cluster and I feel like my arms have been cut off without it.
Thanks in advance,
Harry
It turns out that, as the pod is kube-scheduler, the /var/run/secrets/kubernetes.io/serviceaccount/token the logs refer to is mounted from /etc/kubernetes/scheduler.conf on the master node.
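To confirm that mapping yourself, the static pod manifest on the master lists the hostPath volume that backs the file; something like the following should show it (this assumes the default kubeadm manifest path):
# Assuming the default kubeadm layout: show the volume/mount entries
# referencing scheduler.conf in the kube-scheduler static pod manifest.
grep -B 2 -A 3 'scheduler.conf' /etc/kubernetes/manifests/kube-scheduler.yaml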
For whatever reason, this was a completely empty file in my cluster. I regenerated it by following the instructions for kube-scheduler in Kubernetes The Hard Way:
I ran the following in the /etc/kubernetes/pki directory (where the original CAs remained):
{

cat > kube-scheduler-csr.json <<EOF
{
  "CN": "system:kube-scheduler",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "US",
      "L": "Portland",
      "O": "system:kube-scheduler",
      "OU": "Kubernetes The Hard Way",
      "ST": "Oregon"
    }
  ]
}
EOF

cfssl gencert \
  -ca=ca.pem \
  -ca-key=ca-key.pem \
  -config=ca-config.json \
  -profile=kubernetes \
  kube-scheduler-csr.json | cfssljson -bare kube-scheduler

}
This generates kube-scheduler-key.pem and kube-scheduler.pem.
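As an optional sanity check (assuming openssl is installed on the master), the new certificate can be checked for the expected subject and verified against the cluster CA:
# Print the subject/issuer of the new client cert and verify it against ca.pem.
openssl x509 -in kube-scheduler.pem -noout -subject -issuer
openssl verify -CAfile ca.pem kube-scheduler.pem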
Next, I needed to generate the new config file using the instructions here.
I ran:
{

kubectl config set-cluster kubernetes-the-hard-way \
  --certificate-authority=ca.pem \
  --embed-certs=true \
  --server=https://127.0.0.1:6443 \
  --kubeconfig=kube-scheduler.kubeconfig

kubectl config set-credentials system:kube-scheduler \
  --client-certificate=kube-scheduler.pem \
  --client-key=kube-scheduler-key.pem \
  --embed-certs=true \
  --kubeconfig=kube-scheduler.kubeconfig

kubectl config set-context default \
  --cluster=kubernetes-the-hard-way \
  --user=system:kube-scheduler \
  --kubeconfig=kube-scheduler.kubeconfig

kubectl config use-context default --kubeconfig=kube-scheduler.kubeconfig

}
This generates kube-scheduler.kubeconfig, which I renamed and moved to /etc/kubernetes/scheduler.conf.
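For completeness, something along these lines does the rename/move on a default kubeadm layout; the manifest shuffle at the end is just one way to force the static pod to re-mount the replaced file (restarting the kubelet also works):
# Back up the old (empty) file, then drop the new kubeconfig in its place.
cp /etc/kubernetes/scheduler.conf /etc/kubernetes/scheduler.conf.bak
mv kube-scheduler.kubeconfig /etc/kubernetes/scheduler.conf
# Moving the static pod manifest out of the manifests directory and back
# makes the kubelet recreate the kube-scheduler pod with the new file mounted.
mv /etc/kubernetes/manifests/kube-scheduler.yaml /tmp/
sleep 20
mv /tmp/kube-scheduler.yaml /etc/kubernetes/manifests/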
It was then just a case of reading the logs from the pod (kubectl logs kube-scheduler-xxxxxxx -n kube-system), which complained about various things missing from the configuration file.
These were the 'clusters' and 'contexts' blocks of the YAML, which I copied from one of the other configuration files in the same directory (after verifying that they were all identical).
After copying those into scheduler.conf, the errors stopped and everything kicked back into life.
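If you need to see what those 'clusters' and 'contexts' blocks should look like, one option (assuming the kubeadm default admin.conf is present in the same directory) is to view a known-good kubeconfig; kubectl redacts the embedded certificate data in the output:
# View the structure of a working kubeconfig from the same directory.
kubectl config view --kubeconfig=/etc/kubernetes/admin.conf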