I ran Kubespray in LXC containers with the configuration below (server RAM: 8G, all nodes on Ubuntu 18.04):
+---------+---------+----------------------+
|  NAME   |  STATE  |         IPV4         |
+---------+---------+----------------------+
| ansible | RUNNING | 10.21.185.23 (eth0)  |
| node1   | RUNNING | 10.21.185.158 (eth0) |
| node2   | RUNNING | 10.21.185.186 (eth0) |
| node3   | RUNNING | 10.21.185.65 (eth0)  |
| node4   | RUNNING | 10.21.185.106 (eth0) |
| node5   | RUNNING | 10.21.185.14 (eth0)  |
+---------+---------+----------------------+
From root@ansible, when I ran the Kubespray command to build the cluster, I encountered this error:
TASK [kubernetes/preinstall : Disable swap] ******************
fatal: [node1]: FAILED! => {"changed": true, "cmd": ["/sbin/swapoff", "-a"], "delta": "0:00:00.020302", "end": "2020-05-13 07:21:24.974910", "msg": "non-zero return code", "rc": 255, "start": "2020-05-13 07:21:24.954608", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
fatal: [node2]: FAILED! => {"changed": true, "cmd": ["/sbin/swapoff", "-a"], "delta": "0:00:00.010084", "end": "2020-05-13 07:21:25.051443", "msg": "non-zero return code", "rc": 255, "start": "2020-05-13 07:21:25.041359", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
fatal: [node3]: FAILED! => {"changed": true, "cmd": ["/sbin/swapoff", "-a"], "delta": "0:00:00.008382", "end": "2020-05-13 07:21:25.126695", "msg": "non-zero return code", "rc": 255, "start": "2020-05-13 07:21:25.118313", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
fatal: [node4]: FAILED! => {"changed": true, "cmd": ["/sbin/swapoff", "-a"], "delta": "0:00:00.006829", "end": "2020-05-13 07:21:25.196145", "msg": "non-zero return code", "rc": 255, "start": "2020-05-13 07:21:25.189316", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
LXC container configuration (applies to node1, node2, node3, node4, node5):
architecture: x86_64
config:
image.architecture: amd64
image.description: ubuntu 18.04 LTS amd64 (release) (20200506)
image.label: release
image.os: ubuntu
image.release: bionic
image.serial: "20200506"
image.version: "18.04"
limits.cpu: "2"
limits.memory: 2GB
limits.memory.swap: "false"
linux.kernel_modules: ip_tables,ip6_tables,netlink_diag,nf_nat,overlay
raw.lxc: "lxc.apparmor.profile=unconfined\nlxc.cap.drop= \nlxc.cgroup.devices.allow=a\nlxc.mount.auto=proc:rw sys:rw"
security.nesting: "true"
security.privileged: "true"
volatile.base_image: 93b9eeb85479af2029203b4a56a2f1fdca6a0e1bf23cdc26b567790bf0f3f3bd
volatile.eth0.hwaddr: 00:16:3e:5a:91:9a
volatile.idmap.base: "0"
volatile.idmap.next: '[]'
volatile.last_state.idmap: '[]'
volatile.last_state.power: RUNNING
devices: {}
ephemeral: false
profiles:
- default
stateful: false
description: ""
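For reference, a configuration like this is usually applied through an LXD profile rather than edited per container. A minimal sketch, assuming a hypothetical profile name k8s:

$ lxc profile create k8s      # "k8s" is a hypothetical name
$ lxc profile edit k8s        # paste the config and devices sections shown above
$ lxc launch ubuntu:18.04 node1 --profile default --profile k8s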
When I run swapoff manually on the nodes, I get no output:
root@node1:~# /sbin/swapoff -a
root@node1:~#
It would be very helpful if anyone has an idea.
I divided this answer into 2 parts:
swapoff -a
Kubespray

swapoff -a

Kubespray fails because it gets a non-zero exit code (255) when running swapoff -a.
A non-zero exit status indicates failure. This seemingly counter-intuitive scheme is used so there is one well-defined way to indicate success and a variety of ways to indicate various failure modes.
Even if you set limits.memory.swap: "false" in the profile associated with the containers, it will still produce this error.
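For reference, this is how that key would be set on a single container (it still does not prevent the error):

$ lxc config set node1 limits.memory.swap false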
There is a workaround for it: disable swap on your host system. You can do it by running swapoff -a, commenting out the swap entry in /etc/fstab so the change survives a reboot, and then rebooting.
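A minimal sketch of the host-side steps (the sed pattern assumes a standard fstab layout, so review /etc/fstab before relying on it):

# On the LXD host, not inside the containers:
$ sudo swapoff -a                              # turn off all swap immediately
$ sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab   # comment out swap entries so they stay off after reboot
$ sudo reboot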
After that, your container should produce a zero exit code when issuing:
$ swapoff -a
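You can confirm it inside a container by checking the exit code directly:

root@node1:~# /sbin/swapoff -a; echo $?
0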
Kubespray

Assuming that you created your lxc containers and have full SSH access to them, there are still things to take into consideration before running kubespray.
I ran kubespray on lxc containers and stumbled upon issues with:
kmsg
conntrack
Please make sure you have enough storage within your storage pool, as a lack of it will result in a failure to provision the cluster. The default storage pool size may not be big enough to hold 5 nodes.
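You can check the pool's capacity up front, e.g. (assuming the pool is named default):

$ lxc storage list
$ lxc storage info default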
When provisioning the cluster, please make sure that you have the newest kubespray version available, as older ones had an issue with mutually incompatible docker packages.
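A minimal sketch for fetching a recent release (pick the tag yourself after listing them):

$ git clone https://github.com/kubernetes-sigs/kubespray.git
$ cd kubespray
$ git tag --sort=-v:refname | head -5   # newest release tags first
$ git checkout <TAG>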
The /dev/kmsg character device node provides userspace access to the kernel's printk buffer.
By default, kubespray will fail to provision the cluster when /dev/kmsg is not available on the node. /dev/kmsg is not available in an lxc container, and this will cause kubespray provisioning to fail.
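A quick way to see the problem inside a container (expected to fail before the workaround below):

root@node1:~# ls -l /dev/kmsg
ls: cannot access '/dev/kmsg': No such file or directory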
There is a workaround for it. In each lxc container run:

# Hack required to provision K8s v1.15+ in LXC containers
mknod /dev/kmsg c 1 11
chmod +x /etc/rc.d/rc.local
echo 'mknod /dev/kmsg c 1 11' >> /etc/rc.d/rc.local
Github.com: Justmeandopensource: lxd-provisioning: bootstrap-kube.sh
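Note that /etc/rc.d/rc.local is the CentOS-style path used in the linked script; on the Ubuntu 18.04 nodes from the question the equivalent would be /etc/rc.local. A sketch, assuming systemd's rc-local compatibility unit picks the file up:

# Ubuntu variant of the same hack
mknod /dev/kmsg c 1 11
printf '#!/bin/sh -e\nmknod /dev/kmsg c 1 11\n' > /etc/rc.local   # file does not exist by default on 18.04
chmod +x /etc/rc.local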
I tried other workarounds like:
- adding lxc.kmsg = 1 to /etc/lxc/default.conf (deprecated)
- running echo 'L /dev/kmsg - - - - /dev/console' > /etc/tmpfiles.d/kmsg.conf inside the container and then restarting, which causes systemd-journald to sit at 100% usage of a core.

The LXC/LXD system containers do not load kernel modules for their own use. Instead, you get the host to load the kernel module, and the module then becomes available in the container.
Linuxcontainers.org: How to add kernel modules to LXC container
Kubespray will check if certain kernel modules are available within your nodes. You will need to add the following modules on your host:
ip_vs
ip_vs_sh
ip_vs_rr
ip_vs_wrr
You can add the above modules with $ modprobe MODULE_NAME or follow this link: Cyberciti.biz: Linux how to load a kernel module automatically.
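A minimal sketch for loading them now and making them persist across reboots (the modules-load.d path is standard for systemd hosts):

# On the host:
$ for m in ip_vs ip_vs_sh ip_vs_rr ip_vs_wrr; do sudo modprobe "$m"; done
$ printf '%s\n' ip_vs ip_vs_sh ip_vs_rr ip_vs_wrr | sudo tee /etc/modules-load.d/ipvs.conf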
You will need to install conntrack and load a module named nf_conntrack:
$ apt install conntrack -y
$ modprobe nf_conntrack
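You can verify both before re-running kubespray:

$ lsmod | grep nf_conntrack   # module loaded on the host
$ conntrack -V                # tool installed on the node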
Without the above commands, kubespray will fail at the step of checking the availability of conntrack.
With these changes in place you should be able to run a Kubernetes cluster with kubespray within an lxc environment and get node output similar to this:
root@k8s1:~# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
k8s1 Ready master 14h v1.18.2 10.224.47.185 <none> Ubuntu 18.04.4 LTS 5.4.0-31-generic docker://18.9.7
k8s2 Ready master 14h v1.18.2 10.224.47.98 <none> Ubuntu 18.04.4 LTS 5.4.0-31-generic docker://18.9.7
k8s3 Ready <none> 14h v1.18.2 10.224.47.46 <none> Ubuntu 18.04.4 LTS 5.4.0-31-generic docker://18.9.7
k8s4 Ready <none> 14h v1.18.2 10.224.47.246 <none> Ubuntu 18.04.4 LTS 5.4.0-31-generic docker://18.9.7