docker, vagrant, vagrantfile

Vagrantfile docker provider network errors


I'm attempting to use the following files to create a Kubernetes cluster (a master plus 3 nodes) using the docker provider and Ansible.

Vagrantfile:

IMAGE_NAME = "roboxes/rhel8"
N = 3

Vagrant.configure("2") do |config|
    config.ssh.insert_key = false

    config.vm.provider "docker" do |v|
    end
      
    config.vm.define "k8s-master" do |master|
        master.vm.box = IMAGE_NAME
        master.vm.network "public_network", auto_config: false, ip: "192.168.1.0"
        master.vm.hostname = "k8s-master"
        master.vm.provision "ansible" do |ansible|
            ansible.playbook = "kubernetes-setup/master-playbook.yml"
            ansible.extra_vars = {
                node_ip: "192.168.1.160",
            }
        end
    end

    (1..N).each do |i|
        config.vm.define "node-#{i}" do |node|
            node.vm.box = IMAGE_NAME
            config.vm.network "public_network", auto_config: false, ip: "192.168.1.#{i + 160}"
            node.vm.hostname = "node-#{i}"
            node.vm.provision "ansible" do |ansible|
                ansible.playbook = "kubernetes-setup/node-playbook.yml"
                ansible.extra_vars = {
                    node_ip: "192.168.1.#{i + 161}",
                }
            end
        end
    end
end

and master-playbook.yml:

---
- hosts: k8s-master
  become: true
  tasks:
    - name: Update package cache
      yum:
        name: '*'
        state: latest
      register: yum_update

    - name: Install required packages
      package:
        name: "{{ item }}"
        state: present
      with_items:
        - docker
        - python3

    - name: Start and enable Docker service
      service:
        name: docker
        state: started
        enabled: yes

    - name: Disable SELinux
      selinux:
        state: disabled

    - name: Add Kubernetes Yum repository
      get_url:
        url: "https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg"
        dest: /etc/pki/rpm-gpg/rpm-package-key.gpg

    - name: Add Kubernetes Yum repository (continued)
      get_url:
        url: "https://packages.cloud.google.com/yum/doc/yum-key.gpg"
        dest: /etc/pki/rpm-gpg/yum-key.gpg

    - name: Install Kubernetes components
      package:
        name: "{{ item }}"
        state: present
      with_items:
        - kubelet
        - kubeadm
        - kubectl

    - name: Start and enable kubelet service
      service:
        name: kubelet
        state: started
        enabled: yes

    - name: Initialize Kubernetes cluster
      command: kubeadm init --apiserver-advertise-address="{{ node_ip }}" --pod-network-cidr=192.168.1.0/24
      args:
        creates: /etc/kubernetes/admin.conf
      environment:
        KUBECONFIG: /etc/kubernetes/admin.conf
      register: kubeadm_init_result

    - name: Save the join token
      set_fact:
        join_token: "{{ kubeadm_init_result.stdout_lines[-1] }}"

I'm repeatedly getting errors like below:

An error occurred while executing multiple actions in parallel. Any errors that occurred are shown below.

An error occurred while executing the action on the 'k8s-master' machine. Please handle this error then try again:

The configured network address is not valid within the configured subnet of the defined network. Please update the network settings and try again.

Configured address: 192.168.1.163 Network name: bridge

An error occurred while executing the action on the 'node-3' machine. Please handle this error then try again:

The configured network address is not valid within the configured subnet of the defined network. Please update the network settings and try again.

Configured address: 192.168.1.161 Network name: bridge

These are the prompts and values I provided prior to the error:

[mattyp@karuma kubernetes-setup]$ vagrant up
Bringing machine 'k8s-master' up with 'docker' provider...
Bringing machine 'node-1' up with 'docker' provider...
Bringing machine 'node-2' up with 'docker' provider...
Bringing machine 'node-3' up with 'docker' provider...
==> k8s-master: Creating and configuring docker networks...
Available bridged network interfaces:
1) enp10s0
2) docker0
When choosing an interface, it is usually the one that is
being used to connect to the internet.

Which interface should the network bridge to? 1
Gateway IP address for enp10s0 interface [192.168.1.1]: 192.168.1.254    
When an explicit address is not provided to a container attached
to this bridged network, docker will supply an address to the
container. This is independent of the local DHCP service that
may be available on the network.

Available address range for assignment on enp10s0 interface [192.168.1.0/24]: 
Available bridged network interfaces:
1) enp10s0
2) docker0
When choosing an interface, it is usually the one that is
being used to connect to the internet.

Which interface should the network bridge to? 1
Available bridged network interfaces:
1) enp10s0
2) docker0
When choosing an interface, it is usually the one that is
being used to connect to the internet.

Which interface should the network bridge to? 
1==> k8s-master: An error occurred. The error will be shown after all tasks complete.
==> node-2: Creating and configuring docker networks...
Available bridged network interfaces:
1) enp10s0
2) docker0
When choosing an interface, it is usually the one that is
being used to connect to the internet.

Which interface should the network bridge to? 1
Which interface should the network bridge to? 1
Available bridged network interfaces:
1) enp10s0
2) docker0
When choosing an interface, it is usually the one that is
being used to connect to the internet.

Which interface should the network bridge to? 1
Available bridged network interfaces:
1) enp10s0
2) docker0
When choosing an interface, it is usually the one that is
being used to connect to the internet.

Which interface should the network bridge to? 1
==> node-3: Creating and configuring docker networks...
Available bridged network interfaces:
1) enp10s0
2) docker0
When choosing an interface, it is usually the one that is
being used to connect to the internet.

Which interface should the network bridge to? ==> node-2: Creating the container...
    node-2:   Name: kubernetes-setup_node-2_1690074713
    node-2:  Image: docker.io/roboxes/rhel8:4.2.16
    node-2: Volume: /home/mattyp/vagrant/kubernetes-setup:/vagrant
    node-2:  
    node-2: Container created: e15432d76dc89ed3
==> node-2: Enabling network interfaces...
==> node-2: Starting container...

==> node-3: An error occurred. The error will be shown after all tasks complete.
==> node-1: Creating and configuring docker networks...
Available bridged network interfaces:
1) enp10s0
2) docker0
When choosing an interface, it is usually the one that is
being used to connect to the internet.

Which interface should the network bridge to? 1
Available bridged network interfaces:
1) enp10s0
2) docker0
When choosing an interface, it is usually the one that is
being used to connect to the internet.

Which interface should the network bridge to? 1
Available bridged network interfaces:
1) enp10s0
2) docker0
When choosing an interface, it is usually the one that is
being used to connect to the internet.

Which interface should the network bridge to? 1
==> node-1: Creating the container...
    node-1:   Name: kubernetes-setup_node-1_1690074727
    node-1:  Image: docker.io/roboxes/rhel8:4.2.16
    node-1: Volume: /home/mattyp/vagrant/kubernetes-setup:/vagrant
    node-1:  
    node-1: Container created: 873ff0eb4eafdfd3
==> node-1: Enabling network interfaces...
==> node-1: Starting container...

I know there's fundamentally something I'm doing wrong with the networking, and am at best rusty on networking through vagrant and docker.

Any help in resolving the error and any suggestions regarding doing this "smarter" than I am would be appreciated.


Solution

  • For Vagrant to successfully start and provision Docker containers, the images need to be explicitly designed for this purpose. In particular, they must run an SSH server, provide a vagrant user with the Vagrant insecure public key authorized, and grant that user passwordless sudo.

    A minimal example

    You're using the roboxes/rhel8 image, which has the necessary packages (e.g., openssh-server) preinstalled. If we build a new image using a Dockerfile like this:

    FROM roboxes/rhel8
    
    COPY docker-entrypoint.sh /usr/local/bin/
    # Create the vagrant user with a locked password (key-based SSH only)
    RUN useradd vagrant && \
        echo 'vagrant:*LOCKED*' | chpasswd -e
    # Authorize the well-known Vagrant insecure public key
    RUN mkdir -p /home/vagrant/.ssh && \
        chmod 700 /home/vagrant/.ssh && \
        curl -sSf -o /home/vagrant/.ssh/authorized_keys https://raw.githubusercontent.com/hashicorp/vagrant/master/keys/vagrant.pub && \
        chmod 600 /home/vagrant/.ssh/authorized_keys && \
        chown -R vagrant /home/vagrant
    # Grant the vagrant user passwordless sudo
    RUN echo 'vagrant ALL=(ALL) NOPASSWD: ALL' > /etc/sudoers.d/vagrant && \
        chmod 440 /etc/sudoers.d/vagrant
    # Markers Vagrant expects when it manages entries in /etc/fstab
    RUN printf '#VAGRANT-BEGIN\n#VAGRANT-END\n' >> /etc/fstab
    ENTRYPOINT ["sh", "/usr/local/bin/docker-entrypoint.sh"]
    CMD ["/usr/sbin/sshd", "-D"]
    

    Where docker-entrypoint.sh is:

    #!/bin/sh
    
    # Generate SSH host keys if missing, then hand off to the CMD (sshd)
    ssh-keygen -A
    exec "$@"
    

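    With both files in the same directory, the image can be built and tagged to match the IMAGE_NAME used in the Vagrantfile below (a plain docker build; no special flags are needed):

    ```shell
    # Build the SSH-enabled base image for the Vagrant docker provider
    docker build -t rhel8-ssh .
    ```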
    Then the following Vagrantfile works correctly:

    IMAGE_NAME = "rhel8-ssh"
    N = 3
    
    $script = <<-SCRIPT
    #!/bin/sh
    
    date > /tmp/provisioned
    SCRIPT
    
    Vagrant.configure("2") do |config|
        config.vm.provider "docker" do |d|
          d.image = IMAGE_NAME
          d.has_ssh = true
          d.remains_running = true
        end
    
        config.vm.define "controller" do |controller|
          controller.vm.hostname = "controller"
          controller.vm.provision :shell, inline: $script
          controller.vm.network :public_network,
            ip: "192.168.1.160",
            bridge: "eth0",
            docker_network__ip_range: "192.168.1.160/29",
            docker_network__gateway: "192.168.1.1"
        end
    
        (1..N).each do |i|
          config.vm.define "node-#{i}" do |node|
            node.vm.hostname = "node-#{i}"
            node.vm.provision :shell, inline: $script
            node.vm.network :public_network,
              ip: "192.168.1.#{160 + i}",
              bridge: "eth0",
              docker_network__ip_range: "192.168.1.160/29",
              docker_network__gateway: "192.168.1.1"
          end
        end
    end
    

    Some notes on this configuration:

    • d.has_ssh = true tells Vagrant that the container runs an SSH server, so vagrant ssh and SSH-based provisioners can connect to it.

    • bridge: "eth0" names the host interface explicitly, which avoids the interactive "Which interface should the network bridge to?" prompts you were seeing (on your host this would be enp10s0).

    • The docker_network__* scoped options are passed through to docker network create; restricting the assignment range to 192.168.1.160/29 keeps all of the static addresses inside the network's configured subnet, which is exactly what the original error message was complaining about.

    Running vagrant up with this configuration results in all four nodes coming up successfully with the expected addresses.
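    To spot-check the result (standard Vagrant and Docker commands; the inspect format string is just one way to read the assigned addresses):

    ```shell
    # Show machine state as Vagrant sees it
    vagrant status
    # Print the address(es) each running container received
    docker inspect -f '{{.Name}} {{range .NetworkSettings.Networks}}{{.IPAddress}} {{end}}' $(docker ps -q)
    ```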

    Using the ansible provisioner

    The above works, but you're trying to use the ansible provisioner. We can update the above configuration to use Ansible by modifying it like this:

    IMAGE_NAME = "rhel8-ssh"
    N = 3
    
    Vagrant.configure("2") do |config|
        config.vm.provider "docker" do |d|
          d.image = IMAGE_NAME
          d.has_ssh = true
          d.remains_running = true
        end
    
        config.vm.define "controller" do |controller|
          controller.vm.hostname = "controller"
          controller.vm.network :public_network,
            ip: "192.168.1.160",
            bridge: "eth0",
            docker_network__ip_range: "192.168.1.160/29",
            docker_network__gateway: "192.168.1.1"
          controller.vm.provision "ansible" do |ansible|
            ansible.playbook = "controller-playbook.yaml"
          end
        end
    
        (1..N).each do |i|
          config.vm.define "node-#{i}" do |node|
            node.vm.hostname = "node-#{i}"
            node.vm.network :public_network,
              ip: "192.168.1.#{160 + i}",
              bridge: "eth0",
              docker_network__ip_range: "192.168.1.160/29",
              docker_network__gateway: "192.168.1.1"
            node.vm.provision "ansible" do |ansible|
              ansible.playbook = "node-playbook.yaml"
            end
          end
        end
    end
    

    If node-playbook.yaml and controller-playbook.yaml both contain:

    - hosts: all
      gather_facts: false
      become: true
      tasks:
        - name: write a flag file
          copy:
            content: this is a test
            dest: /tmp/testfile
    

    Then this all runs correctly as well.
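    A quick way to confirm the provisioner actually ran everywhere (a hypothetical check, using the flag file the test playbook writes):

    ```shell
    # Each machine should have /tmp/testfile after "vagrant up"
    for m in controller node-1 node-2 node-3; do
      vagrant ssh "$m" -c 'cat /tmp/testfile'
    done
    ```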

    Playbook problems

    Now let's look at your actual playbook.

    Package installation issues

    In the first section, you're trying to install some packages:

        - name: Update package cache
          yum:
            name: '*'
            state: latest
          register: yum_update
    
        - name: Install required packages
          package:
            name: "{{ item }}"
            state: present
          with_items:
            - docker
            - python3
    

    The image you're using is based on RHEL8. For system packages, you will need either (a) a Red Hat subscription in order to enable the official repositories, or (b) alternative repositories.

    For Docker, you would need to enable the Docker repositories.

    As currently written, this task will result in:

    TASK [Install required packages] ***********************************************
    failed: [controller] (item=docker) => {"ansible_loop_var": "item", "changed": false, "failures": ["No package docker available."], "item": "docker", "msg": "Failed to install some of the specified packages", "rc": 1, "results": []}
    
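    One common workaround, shown here as shell commands rather than playbook tasks (this is my assumption, not part of the original setup): add Docker's upstream CE repository, where the engine package is named docker-ce rather than docker.

    ```shell
    # Requires dnf-plugins-core; the CentOS repo is widely used on EL8-family hosts
    dnf -y install dnf-plugins-core
    dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
    dnf -y install docker-ce docker-ce-cli containerd.io
    ```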

    Inappropriate configuration

    Next, attempting to disable SELinux in the container doesn't make any sense -- SELinux is a kernel-level configuration that affects the entire host; it cannot be set per-container.

    Service management

    Your playbook is attempting to start a service using the service module, but inside a container like this there is no init system or service manager running. The service module won't work, nor will command-line approaches like systemctl start ....

    Additional complications

    You're trying to run Docker inside a Docker container. While this is possible, it requires some additional configuration.
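    Two common patterns, sketched here under the assumption that you really do need Docker available inside the container:

    ```shell
    # Option 1: share the host daemon's socket; containers started this way
    # are siblings of the current container, not children
    docker run --rm -v /var/run/docker.sock:/var/run/docker.sock docker:cli docker ps

    # Option 2: run a nested Docker daemon in a privileged container
    docker run --privileged -d --name dind docker:dind
    ```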

    Looking at alternatives

    If you want to set up a containerized Kubernetes configuration, consider using kind instead, which was designed for exactly this purpose and is substantially easier to set up; the simplest approach is:

    kind create cluster
    
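    kind can also model the one-controller/three-worker layout above with a declarative config file:

    ```shell
    # Write a kind cluster config mirroring the 4-node topology, then create it
    cat > kind-config.yaml <<'EOF'
    kind: Cluster
    apiVersion: kind.x-k8s.io/v1alpha4
    nodes:
      - role: control-plane
      - role: worker
      - role: worker
      - role: worker
    EOF
    kind create cluster --config kind-config.yaml
    ```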

    Alternatively, if you're trying to reproduce an existing bare-metal Kubernetes installation, consider using virtual machines instead of containers, and pick something other than RHEL as your base image unless you have a subscription.


    The files referenced in this answer can be found in this repository.