I am trying to create a MicroK8s cluster on GCP VMs using Ansible. I want to create a three-node cluster: one master and two workers.
This is the playbook I am using:
---
- name: Ansible playbook to create Microk8s Cluster
  hosts: kubenodes
  become: true
  tasks:
    - name: Update web servers
      apt:
        upgrade: yes
        update_cache: yes
    - name: Install snap package installer
      apt:
        name: snapd
        state: present
    - name: Install Microk8s
      snap:
        name: microk8s
        classic: true
    - name: Add current user to microk8s group
      shell:
        cmd: "sudo usermod -a -G microk8s {{ ansible_user }}"

- hosts: "master"
  become: true
  vars:
    workers_count: "{{ groups.workers | length }}"
  tasks:
    - name: Create join node command
      shell: /snap/bin/microk8s add-node
      register: join_token
      ignore_errors: true
      loop: "{{ range(1, num_iterations|int + 1) | list }}"
      vars:
        num_iterations: "{{ workers_count }}"
    - set_fact:
        microk8s_join_list: "{{ microk8s_join_list | default([]) + ['/snap/bin/' + item.stdout_lines[4]] }}"
      loop: "{{ join_token.results }}"
    - name: Add Cluster dashboard, Ingress and Cert Manager addon
      shell: /snap/bin/microk8s enable community ingress dashboard cert-manager
    - name: Add Argocd addon
      shell: /snap/bin/microk8s enable argocd
    - name: Store dashboard token
      shell: /snap/bin/microk8s kubectl create token default
      register: dashboard_token
      ignore_errors: true
    - name: Save Kubernetes dashboard token
      set_fact:
        dashboard_token: "{{ dashboard_token.stdout }}"
    - name: Run command on worker nodes
      shell: "{{ item }}"
      delegate_to: "{{ groups.workers[ansible_loop.index0] }}"
      loop: "{{ microk8s_join_list }}"
      loop_control:
        extended: true
This is my inventory:
[master]
35.224.xx.xx
[workers]
107.178.xx.xx
35.225.xx.xx
[kubenodes:children]
workers
master
Every task works fine except the last one, Run command on worker nodes.
This is the error I am currently getting:
failed: [35.224.xx.xx -> 107.178.xx.xx] (item=/snap/bin/microk8s join 10.128.0.26:25000/62733ae8aca2493665dc2fb3c8a7ec2a/04e060b6d262 --worker) => {"ansible_loop": {"allitems": ["/snap/bin/microk8s join 10.128.0.26:25000/62733ae8aca2493665dc2fb3c8a7ec2a/04e060b6d262 --worker", "/snap/bin/microk8s join 10.128.0.26:25000/4dbfa58f2a996760ab51e41369d718ef/04e060b6d262 --worker"], "first": true, "index": 1, "index0": 0, "last": false, "length": 2, "nextitem": "/snap/bin/microk8s join 10.128.0.26:25000/4dbfa58f2a996760ab51e41369d718ef/04e060b6d262 --worker", "revindex": 2, "revindex0": 1}, "ansible_loop_var": "item", "item": "/snap/bin/microk8s join 10.128.0.26:25000/62733ae8aca2493665dc2fb3c8a7ec2a/04e060b6d262 --worker", "msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname inventory_hostname: Name does not resolve", "unreachable": true}
failed: [35.224.xx.xx -> 35.225.xx.xx] (item=/snap/bin/microk8s join 10.128.0.26:25000/4dbfa58f2a996760ab51e41369d718ef/04e060b6d262 --worker) => {"ansible_loop": {"allitems": ["/snap/bin/microk8s join 10.128.0.26:25000/62733ae8aca2493665dc2fb3c8a7ec2a/04e060b6d262 --worker", "/snap/bin/microk8s join 10.128.0.26:25000/4dbfa58f2a996760ab51e41369d718ef/04e060b6d262 --worker"], "first": false, "index": 2, "index0": 1, "last": true, "length": 2, "previtem": "/snap/bin/microk8s join 10.128.0.26:25000/62733ae8aca2493665dc2fb3c8a7ec2a/04e060b6d262 --worker", "revindex": 1, "revindex0": 0}, "ansible_loop_var": "item", "item": "/snap/bin/microk8s join 10.128.0.26:25000/4dbfa58f2a996760ab51e41369d718ef/04e060b6d262 --worker", "msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname inventory_hostname: Name does not resolve", "unreachable": true}
fatal: [35.224.xx.xx -> {{ groups.workers[ansible_loop.index0] }}]: UNREACHABLE! => {"changed": false, "msg": "All items completed", "results": [{"ansible_loop": {"allitems": ["/snap/bin/microk8s join 10.128.0.26:25000/62733ae8aca2493665dc2fb3c8a7ec2a/04e060b6d262 --worker", "/snap/bin/microk8s join 10.128.0.26:25000/4dbfa58f2a996760ab51e41369d718ef/04e060b6d262 --worker"], "first": true, "index": 1, "index0": 0, "last": false, "length": 2, "nextitem": "/snap/bin/microk8s join 10.128.0.26:25000/4dbfa58f2a996760ab51e41369d718ef/04e060b6d262 --worker", "revindex": 2, "revindex0": 1}, "ansible_loop_var": "item", "item": "/snap/bin/microk8s join 10.128.0.26:25000/62733ae8aca2493665dc2fb3c8a7ec2a/04e060b6d262 --worker", "msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname inventory_hostname: Name does not resolve", "unreachable": true}, {"ansible_loop": {"allitems": ["/snap/bin/microk8s join 10.128.0.26:25000/62733ae8aca2493665dc2fb3c8a7ec2a/04e060b6d262 --worker", "/snap/bin/microk8s join 10.128.0.26:25000/4dbfa58f2a996760ab51e41369d718ef/04e060b6d262 --worker"], "first": false, "index": 2, "index0": 1, "last": true, "length": 2, "previtem": "/snap/bin/microk8s join 10.128.0.26:25000/62733ae8aca2493665dc2fb3c8a7ec2a/04e060b6d262 --worker", "revindex": 1, "revindex0": 0}, "ansible_loop_var": "item", "item": "/snap/bin/microk8s join 10.128.0.26:25000/4dbfa58f2a996760ab51e41369d718ef/04e060b6d262 --worker", "msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname inventory_hostname: Name does not resolve", "unreachable": true}]}
Please note: when modified for just one worker node the playbook works, but for a multi-node cluster I get the above error.
I have tried creating a dummy node to hold the join command, but I couldn't figure it out. I have also tried using debug to print out groups.workers[ansible_loop.index0] to check that I am getting the workers' IPs, and of course they were printed out without any errors; it just doesn't work for the main task.
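For reference, the debug check looked roughly like this (a sketch of what I ran inside the master play after microk8s_join_list had been built; the task name and message wording are just for illustration):

- name: Debug which worker each join command would target
  debug:
    msg: "Would delegate '{{ item }}' to {{ groups.workers[ansible_loop.index0] }}"
  loop: "{{ microk8s_join_list }}"
  loop_control:
    extended: true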
I believe delegate_to only works with a single host (and does not get re-evaluated on every loop iteration), so it makes sense that you are seeing that error for groups.workers[ansible_loop.index0]:
- name: Run command on worker nodes
  shell: "{{ item }}"
  delegate_to: "{{ groups.workers[ansible_loop.index0] }}"
  loop: "{{ microk8s_join_list }}"
  loop_control:
    extended: true
In essence, when you have all the workers, groups.workers[ansible_loop.index0] would translate to something like 107.178.xx.xx 35.225.xx.xx, which is not a resolvable hostname. With a single worker it works, because it resolves to something like 107.178.xx.xx, and that is resolvable.
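A possible way around it (a sketch, not something I have run against your exact setup) is to skip delegate_to altogether: keep building microk8s_join_list on the master exactly as you do now, then add a second play that targets the workers directly and lets each worker pick its own join command out of the master's facts via hostvars. The index lookup below assumes the order of join tokens matches the order of hosts in the workers group:

# Sketch: run the join on each worker over its own connection instead of delegating.
- hosts: workers
  become: true
  tasks:
    - name: Join this worker to the cluster
      # microk8s_join_list was set with set_fact on the master in the previous
      # play, so it is reachable through hostvars; the Nth worker in the group
      # takes the Nth join command from the list (order is assumed to match).
      shell: "{{ hostvars[groups.master[0]].microk8s_join_list[groups.workers.index(inventory_hostname)] }}"

That way each worker runs the join over its own SSH connection, and nothing has to be resolved beyond the hosts already in your inventory.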
✌️