kubernetes fedora-25 kubelet rkt

kubelet fails to start with rocket


I would like to ask the Kubernetes community for advice on the kubelet problem I described in https://github.com/rkt/rkt/issues/3647. A brief summary of the issue is copied here; the full details are in the GitHub link above.

Summary

I have a problem setting up a dev cluster on my Fedora VM. I believe everything is properly configured, and I can't find a good hint about what's wrong.
My plan is to try Kubernetes 1.6 with Rocket (rkt) 1.25 on Fedora 25 Workstation. Everything seems to start, but I suspect the kubelet is unable to communicate with the rkt API.
I tried to use the manual installation procedure (url removed see git) and some of the example systemd services from the Kubernetes git repository. For the change from Docker to rkt, I tried following the hints in (url removed see git).
I prefer to do the whole setup from scratch in order to better understand the components involved and the differences and effort of switching to Rocket, and in order to use 1.6 (the Fedora repo only has a 1.5 RPM).
I suspect that the problem is the

"rkt grpc: Server.Serve failed to create ServerTransport: connection error: desc = "transport: write tcp [::1]:15441->[::1]:46834: write: broken pipe"

message below, but I am not sure whether my assumption is correct or how to fix it.

Could you please help me find out whether this is a software defect or a configuration mistake?

Error

[root@pp-fed-vm ppavlov]# systemctl restart kubelet; sleep 5; systemctl status kubelet
-> /usr/bin/kubelet --logtostderr=true --v=0 --api-servers=http://127.0.0.1:8080 --address=127.0.0.1 --hostname-override=pp-fed-vm --allow-privileged=false --container-runtime=rkt --rkt-api-endpoint=127.0.0.1:15441 --cgroup-driver=systemd

kubelet[23222]: I0416 20:54:54.996239 23222 server.go:294] Adding debug handlers to kubelet server.
kubelet[23222]: I0416 20:54:55.081344 23222 kubelet_node_status.go:230] Setting node annotation to enable volume controller attach/detach
kubelet[23222]: E0416 20:54:55.092550 23222 kubelet.go:1661] Failed to check if disk space is available for the runtime: failed to get fs info for "runtime": unable to find data for container /
kubelet[23222]: E0416 20:54:55.092649 23222 kubelet.go:1669] Failed to check if disk space is available on the root partition: failed to get fs info for "root": unable to find data for container /
kubelet[23222]: I0416 20:54:55.092759 23222 kubelet_node_status.go:77] Attempting to register node pp-fed-vm
kubelet[23222]: I0416 20:54:55.096327 23222 kubelet_node_status.go:128] Node pp-fed-vm was previously registered
kubelet[23222]: I0416 20:54:55.096374 23222 kubelet_node_status.go:80] Successfully registered node pp-fed-vm
kubelet[23222]: E0416 20:54:55.109909 23222 kubelet.go:1661] Failed to check if disk space is available for the runtime: failed to get fs info for "runtime": unable to find data for container /
kubelet[23222]: E0416 20:54:55.109939 23222 kubelet.go:1669] Failed to check if disk space is available on the root partition: failed to get fs info for "root": unable to find data for container /
kubelet[23222]: I0416 20:54:55.109970 23222 kubelet_node_status.go:682] Node became not ready: {Type:Ready Status:False LastHeartbeatTime:2017-04-16 20:54:55.10995563 +020

 [root@pp-fed-vm k8s]# journalctl -xe

kubelet[23362]: I0416 20:57:11.736831    23362 feature_gate.go:144] feature gates: map[]
kubelet[23362]: W0416 20:57:11.736896    23362 server.go:715] Could not load kubeconfig file /var/lib/kubelet/kubeconfig: stat /var/lib/kubelet/kubeconfig:no such file or directory. Using default client config instead.
kubelet[23362]: W0416 20:57:11.737389    23362 cni.go:157] Unable to update cni config: No networks found in /etc/cni/net.d
kubelet[23362]: I0416 20:57:11.743870    23362 manager.go:143] cAdvisor running in container: "/system.slice"
rkt[23022]     : 2017/04/16 20:57:11 grpc: Server.Serve failed to create ServerTransport:  connection error: desc = "transport: write tcp [::1]:15441->[::1]:35570: write: broken pipe"
kubelet[23362]: I0416 20:57:11.784074    23362 fs.go:117] Filesystem partitions: map[/dev/mapper/fedora_pp--fed--vm-root:{mountpoint:/ major:253 minor:0 fsType:ext4 blockSize:0} /dev/sda1:{mountpoint:/boot major:8 minor:1 fsType:ext4 blockSize:0}
kubelet[23362]: I0416 20:57:11.785244    23362 manager.go:198] Machine: {NumCores:4 CpuFrequency:2893314 MemoryCapacity:8370282496 MachineID:faacc8d3edd840e6b32933eb5ca9217f SystemUUID:2E6067FA-A4D6-42F3-8FA7-86BD63DEEF74 BootID:643ca77e-bf97-4794-b9a6-f4bab0efaeab 
kubelet[23362]: tanceID:None}
kubelet[23362]: I0416 20:57:11.793513    23362 manager.go:204] Version: {KernelVersion:4.10.9-200.fc25.x86_64 ContainerOsVersion:Fedora 25 (Workstation Edition) DockerVersion:1.12.6 CadvisorVersion: CadvisorRevision:}
kubelet[23362]: I0416 20:57:11.796793    23362 server.go:509] --cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /
kubelet[23362]: W0416 20:57:11.798146    23362 container_manager_linux.go:218] Running with swap on is not supported, please disable swap! This will be a fatal error by defaut starting in K8s v1.6!
kubelet[23362]: I0416 20:57:11.798224    23362 container_manager_linux.go:245] container manager verified user specified cgroup-root exists: /
kubelet[23362]: I0416 20:57:11.798233    23362 container_manager_linux.go:250] Creating Container Manager object based on Node Config: {RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:rkt CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:systemd
kubelet[23362]: I0416 20:57:11.798384    23362 kubelet.go:265] Watching apiserver
kubelet[23362]: W0416 20:57:11.800709    23362 kubelet_network.go:63] Hairpin mode set to "promiscuous-bridge" but container runtime is "rkt", ignoring
kubelet[23362]: I0416 20:57:11.800945    23362 kubelet.go:494] Hairpin mode set to "none"
kubelet[23362]: I0416 20:57:11.807565    23362 server.go:869] Started kubelet v1.6.1
kubelet[23362]: E0416 20:57:11.808028    23362 kubelet.go:1165] Image garbage collection failed: unable to find data for container /
kubelet[23362]: I0416 20:57:11.808693    23362 fs_resource_analyzer.go:66] Starting FS ResourceAnalyzer
kubelet[23362]: I0416 20:57:11.808719    23362 status_manager.go:140] Starting to sync pod status with apiserver
kubelet[23362]: I0416 20:57:11.808725    23362 kubelet.go:1741] Starting kubelet main sync loop.
kubelet[23362]: I0416 20:57:11.808849    23362 kubelet.go:1752] skipping pod synchronization - [container runtime is down PLEG is not healthy: pleg was last seen active 25620
kubelet[23362]: I0416 20:57:11.809008    23362 server.go:127] Starting to listen on 127.0.0.1:10250
kubelet[23362]: I0416 20:57:11.809547    23362 server.go:294] Adding debug handlers to kubelet server.
kubelet[23362]: W0416 20:57:11.811322    23362 container_manager_linux.go:741] CPUAccounting not enabled for pid: 23362
kubelet[23362]: W0416 20:57:11.811333    23362 container_manager_linux.go:744] MemoryAccounting not enabled for pid: 23362
kubelet[23362]: I0416 20:57:11.811389    23362 volume_manager.go:248] Starting Kubelet Volume Manager
kubelet[23362]: E0416 20:57:11.814453    23362 kubelet.go:2058] Container runtime status is nil
audit: NETFILTER_CFG table=nat family=2 entries=37
audit: NETFILTER_CFG table=nat family=2 entries=37
audit: NETFILTER_CFG table=nat family=2 entries=37
kubelet[23362]: I0416 20:57:11.911763    23362 kubelet_node_status.go:230] Setting node annotation to enable volume controller attach/detach
kubelet[23362]: E0416 20:57:11.991246    23362 kubelet.go:1661] Failed to check if disk space is available for the runtime: failed to get fs info for "runtime": unable to find data for container /
kubelet[23362]: E0416 20:57:11.991281    23362 kubelet.go:1669] Failed to check if disk space is available on the root partition: failed to get fs info for "root": unable to find data for container /
kubelet[23362]: I0416 20:57:11.991382    23362 kubelet_node_status.go:77] Attempting to register node pp-fed-vm
kubelet[23362]: I0416 20:57:11.998021    23362 kubelet_node_status.go:128] Node pp-fed-vm was previously registered
kubelet[23362]: I0416 20:57:11.998040    23362 kubelet_node_status.go:80] Successfully registered node pp-fed-vm
kubelet[23362]: E0416 20:57:12.021057    23362 kubelet.go:1661] Failed to check if disk space is available for the runtime: failed to get fs info for "runtime": unable to find data for container /
kubelet[23362]: E0416 20:57:12.021083    23362 kubelet.go:1669] Failed to check if disk space is available on the root partition: failed to get fs info for "root": unable to find data for container / 
kubelet[23362]: I0416 20:57:16.810561    23362 kubelet.go:1752] skipping pod synchronization - [container runtime is down]
kubelet[23362]: E0416 20:57:16.816235    23362 kubelet.go:2058] Container runtime status is n

Solution

  • The error message "Container runtime status is nil" indicates a bug in which the kubelet wrongly concludes that the container runtime is not up, and therefore reports the node as not ready. You can verify this by running kubectl describe node <node_name> and checking the node's current status. I created this patch to fix the bug. Hopefully the fix will make it into the Kubernetes 1.6.2 release.
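
The check described above can be sketched as a pair of small shell helpers. This is only an illustrative sketch, not part of the patch: the function names runtime_down_lines and node_ready_status are hypothetical, and the sketch assumes the kubelet runs as a systemd unit named "kubelet" and that kubectl is already configured to reach the API server.

```shell
# Hypothetical helper: keep only the kubelet log lines (read from stdin)
# that show the kubelet treating the container runtime as down.
runtime_down_lines() {
    grep -E 'Container runtime status is nil|container runtime is down'
}

# Hypothetical helper: print the status of the node's "Ready" condition
# ("True", "False" or "Unknown") for the node name given as $1.
node_ready_status() {
    kubectl get node "$1" \
        -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
}

# Example usage (assumes a systemd unit called "kubelet" and the node
# name pp-fed-vm from the logs in the question):
#   journalctl -u kubelet --no-pager | runtime_down_lines
#   node_ready_status pp-fed-vm
```

If the first command keeps printing matching lines while the rkt API service is running, and the second one prints False, the node is most likely affected by the bug described in this answer rather than by a configuration mistake.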