docker, vagrant, vagrantfile

Vagrantfile docker provider network errors


I'm attempting to use the following files to create a Kubernetes cluster (a master plus 3 nodes) using the docker provider and Ansible.

Vagrantfile:

IMAGE_NAME = "roboxes/rhel8"
N = 3

Vagrant.configure("2") do |config|
    config.ssh.insert_key = false

    config.vm.provider "docker" do |v|
    end
      
    config.vm.define "k8s-master" do |master|
        master.vm.box = IMAGE_NAME
        master.vm.network "public_network", auto_config: false, ip: "192.168.1.0"
        master.vm.hostname = "k8s-master"
        master.vm.provision "ansible" do |ansible|
            ansible.playbook = "kubernetes-setup/master-playbook.yml"
            ansible.extra_vars = {
                node_ip: "192.168.1.160",
            }
        end
    end

    (1..N).each do |i|
        config.vm.define "node-#{i}" do |node|
            node.vm.box = IMAGE_NAME
            config.vm.network "public_network", auto_config: false, ip: "192.168.1.#{i + 160}"
            node.vm.hostname = "node-#{i}"
            node.vm.provision "ansible" do |ansible|
                ansible.playbook = "kubernetes-setup/node-playbook.yml"
                ansible.extra_vars = {
                    node_ip: "192.168.1.#{i + 161}",
                }
            end
        end
    end
end

and master-playbook.yml:

---
- hosts: k8s-master
  become: true
  tasks:
    - name: Update package cache
      yum:
        name: '*'
        state: latest
      register: yum_update

    - name: Install required packages
      package:
        name: "{{ item }}"
        state: present
      with_items:
        - docker
        - python3

    - name: Start and enable Docker service
      service:
        name: docker
        state: started
        enabled: yes

    - name: Disable SELinux
      selinux:
        state: disabled

    - name: Add Kubernetes Yum repository
      get_url:
        url: "https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg"
        dest: /etc/pki/rpm-gpg/rpm-package-key.gpg

    - name: Add Kubernetes Yum repository (continued)
      get_url:
        url: "https://packages.cloud.google.com/yum/doc/yum-key.gpg"
        dest: /etc/pki/rpm-gpg/yum-key.gpg

    - name: Install Kubernetes components
      package:
        name: "{{ item }}"
        state: present
      with_items:
        - kubelet
        - kubeadm
        - kubectl

    - name: Start and enable kubelet service
      service:
        name: kubelet
        state: started
        enabled: yes

    - name: Initialize Kubernetes cluster
      command: kubeadm init --apiserver-advertise-address="{{ node_ip }}" --pod-network-cidr=192.168.1.0/24
      args:
        creates: /etc/kubernetes/admin.conf
      environment:
        KUBECONFIG: /etc/kubernetes/admin.conf
      register: kubeadm_init_result

    - name: Save the join token
      set_fact:
        join_token: "{{ kubeadm_init_result.stdout_lines[-1] }}"

I'm repeatedly getting errors like below:

An error occurred while executing multiple actions in parallel. Any errors that occurred are shown below.

An error occurred while executing the action on the 'k8s-master' machine. Please handle this error then try again:

The configured network address is not valid within the configured subnet of the defined network. Please update the network settings and try again.

Configured address: 192.168.1.163 Network name: bridge

An error occurred while executing the action on the 'node-3' machine. Please handle this error then try again:

The configured network address is not valid within the configured subnet of the defined network. Please update the network settings and try again.

Configured address: 192.168.1.161 Network name: bridge

These are the prompts and values I provided prior to the error:

[mattyp@karuma kubernetes-setup]$ vagrant up
Bringing machine 'k8s-master' up with 'docker' provider...
Bringing machine 'node-1' up with 'docker' provider...
Bringing machine 'node-2' up with 'docker' provider...
Bringing machine 'node-3' up with 'docker' provider...
==> k8s-master: Creating and configuring docker networks...
Available bridged network interfaces:
1) enp10s0
2) docker0
When choosing an interface, it is usually the one that is
being used to connect to the internet.

Which interface should the network bridge to? 1
Gateway IP address for enp10s0 interface [192.168.1.1]: 192.168.1.254    
When an explicit address is not provided to a container attached
to this bridged network, docker will supply an address to the
container. This is independent of the local DHCP service that
may be available on the network.

Available address range for assignment on enp10s0 interface [192.168.1.0/24]: 
Available bridged network interfaces:
1) enp10s0
2) docker0
When choosing an interface, it is usually the one that is
being used to connect to the internet.

Which interface should the network bridge to? 1
Available bridged network interfaces:
1) enp10s0
2) docker0
When choosing an interface, it is usually the one that is
being used to connect to the internet.

Which interface should the network bridge to? 
1==> k8s-master: An error occurred. The error will be shown after all tasks complete.
==> node-2: Creating and configuring docker networks...
Available bridged network interfaces:
1) enp10s0
2) docker0
When choosing an interface, it is usually the one that is
being used to connect to the internet.

Which interface should the network bridge to? 1
Which interface should the network bridge to? 1
Available bridged network interfaces:
1) enp10s0
2) docker0
When choosing an interface, it is usually the one that is
being used to connect to the internet.

Which interface should the network bridge to? 1
Available bridged network interfaces:
1) enp10s0
2) docker0
When choosing an interface, it is usually the one that is
being used to connect to the internet.

Which interface should the network bridge to? 1
==> node-3: Creating and configuring docker networks...
Available bridged network interfaces:
1) enp10s0
2) docker0
When choosing an interface, it is usually the one that is
being used to connect to the internet.

Which interface should the network bridge to? ==> node-2: Creating the container...
    node-2:   Name: kubernetes-setup_node-2_1690074713
    node-2:  Image: docker.io/roboxes/rhel8:4.2.16
    node-2: Volume: /home/mattyp/vagrant/kubernetes-setup:/vagrant
    node-2:  
    node-2: Container created: e15432d76dc89ed3
==> node-2: Enabling network interfaces...
==> node-2: Starting container...

==> node-3: An error occurred. The error will be shown after all tasks complete.
==> node-1: Creating and configuring docker networks...
Available bridged network interfaces:
1) enp10s0
2) docker0
When choosing an interface, it is usually the one that is
being used to connect to the internet.

Which interface should the network bridge to? 1
Available bridged network interfaces:
1) enp10s0
2) docker0
When choosing an interface, it is usually the one that is
being used to connect to the internet.

Which interface should the network bridge to? 1
Available bridged network interfaces:
1) enp10s0
2) docker0
When choosing an interface, it is usually the one that is
being used to connect to the internet.

Which interface should the network bridge to? 1
==> node-1: Creating the container...
    node-1:   Name: kubernetes-setup_node-1_1690074727
    node-1:  Image: docker.io/roboxes/rhel8:4.2.16
    node-1: Volume: /home/mattyp/vagrant/kubernetes-setup:/vagrant
    node-1:  
    node-1: Container created: 873ff0eb4eafdfd3
==> node-1: Enabling network interfaces...
==> node-1: Starting container...

I know there's fundamentally something I'm doing wrong with the networking, and am at best rusty on networking through vagrant and docker.

Any help in resolving the error and any suggestions regarding doing this "smarter" than I am would be appreciated.


Solution

  • For Vagrant to successfully start and provision Docker containers, the images need to be explicitly designed for this purpose. In particular, they must run an SSH server, provide a vagrant user with the Vagrant insecure public key authorized, and grant that user passwordless sudo.

    A minimal example

    You're using the roboxes/rhel8 image, which has the necessary packages (e.g., openssh-server) preinstalled. If we build a new image using a Dockerfile like this:

    FROM roboxes/rhel8
    
    COPY docker-entrypoint.sh /usr/local/bin/
    # Create the vagrant user with a locked password (key-based SSH only)
    RUN useradd vagrant && \
        echo 'vagrant:*LOCKED*' | chpasswd -e
    # Authorize the well-known Vagrant insecure public key
    RUN mkdir -p /home/vagrant/.ssh && \
        chmod 700 /home/vagrant/.ssh && \
        curl -sSf -o /home/vagrant/.ssh/authorized_keys https://raw.githubusercontent.com/hashicorp/vagrant/master/keys/vagrant.pub && \
        chmod 600 /home/vagrant/.ssh/authorized_keys && \
        chown -R vagrant /home/vagrant
    # Grant the vagrant user passwordless sudo
    RUN echo 'vagrant ALL=(ALL) NOPASSWD: ALL' > /etc/sudoers.d/vagrant && \
        chmod 440 /etc/sudoers.d/vagrant
    # Markers Vagrant expects when it manages entries in /etc/fstab
    RUN printf '#VAGRANT-BEGIN\n#VAGRANT-END\n' >> /etc/fstab
    ENTRYPOINT ["sh", "/usr/local/bin/docker-entrypoint.sh"]
    CMD ["/usr/sbin/sshd", "-D"]
    

    Where docker-entrypoint.sh is:

    #!/bin/sh
    
    # Generate SSH host keys if missing, then hand off to the CMD (sshd)
    ssh-keygen -A
    exec "$@"
    

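    With both files in the same directory, the image can be built and tagged to match the IMAGE_NAME used in the Vagrantfile below (a plain docker build; no special flags are needed):

    ```shell
    # Build the SSH-enabled base image for the Vagrant docker provider
    docker build -t rhel8-ssh .
    ```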
    Then the following Vagrantfile works correctly:

    IMAGE_NAME = "rhel8-ssh"
    N = 3
    
    $script = <<-SCRIPT
    #!/bin/sh
    
    date > /tmp/provisioned
    SCRIPT
    
    Vagrant.configure("2") do |config|
        config.vm.provider "docker" do |d|
          d.image = IMAGE_NAME
          d.has_ssh = true
          d.remains_running = true
        end
    
        config.vm.define "controller" do |controller|
          controller.vm.hostname = "controller"
          controller.vm.provision :shell, inline: $script
          controller.vm.network :public_network,
            ip: "192.168.1.160",
            bridge: "eth0",
            docker_network__ip_range: "192.168.1.160/29",
            docker_network__gateway: "192.168.1.1"
        end
    
        (1..N).each do |i|
          config.vm.define "node-#{i}" do |node|
            node.vm.hostname = "node-#{i}"
            node.vm.provision :shell, inline: $script
            node.vm.network :public_network,
              ip: "192.168.1.#{160 + i}",
              bridge: "eth0",
              docker_network__ip_range: "192.168.1.160/29",
              docker_network__gateway: "192.168.1.1"
          end
        end
    end
    

    Some notes on this configuration:

    • d.has_ssh = true tells Vagrant that the container runs an SSH server, so vagrant ssh and SSH-based provisioners can connect to it.

    • bridge: "eth0" names the host interface explicitly, which avoids the interactive "Which interface should the network bridge to?" prompts you were seeing (on your host this would be enp10s0).

    • The docker_network__* scoped options are passed through to docker network create; restricting the assignment range to 192.168.1.160/29 keeps all of the static addresses inside the network's configured subnet, which is exactly what the original error message was complaining about.

    Running vagrant up with this configuration results in all four nodes coming up successfully with the expected addresses.
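    To spot-check the result (standard Vagrant and Docker commands; the inspect format string is just one way to read the assigned addresses):

    ```shell
    # Show machine state as Vagrant sees it
    vagrant status
    # Print the address(es) each running container received
    docker inspect -f '{{.Name}} {{range .NetworkSettings.Networks}}{{.IPAddress}} {{end}}' $(docker ps -q)
    ```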

    Using the ansible provisioner

    The above works, but you're trying to use the ansible provisioner. We can update the above configuration to use Ansible by modifying it like this:

    IMAGE_NAME = "rhel8-ssh"
    N = 3
    
    Vagrant.configure("2") do |config|
        config.vm.provider "docker" do |d|
          d.image = IMAGE_NAME
          d.has_ssh = true
          d.remains_running = true
        end
    
        config.vm.define "controller" do |controller|
          controller.vm.hostname = "controller"
          controller.vm.network :public_network,
            ip: "192.168.1.160",
            bridge: "eth0",
            docker_network__ip_range: "192.168.1.160/29",
            docker_network__gateway: "192.168.1.1"
          controller.vm.provision "ansible" do |ansible|
            ansible.playbook = "controller-playbook.yaml"
          end
        end
    
        (1..N).each do |i|
          config.vm.define "node-#{i}" do |node|
            node.vm.hostname = "node-#{i}"
            node.vm.network :public_network,
              ip: "192.168.1.#{160 + i}",
              bridge: "eth0",
              docker_network__ip_range: "192.168.1.160/29",
              docker_network__gateway: "192.168.1.1"
            node.vm.provision "ansible" do |ansible|
              ansible.playbook = "node-playbook.yaml"
            end
          end
        end
    end
    

    If node-playbook.yaml and controller-playbook.yaml both contain:

    - hosts: all
      gather_facts: false
      become: true
      tasks:
        - name: write a flag file
          copy:
            content: this is a test
            dest: /tmp/testfile
    

    Then this all runs correctly as well.
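    A quick way to confirm the provisioner actually ran everywhere (a hypothetical check, using the flag file the test playbook writes):

    ```shell
    # Each machine should have /tmp/testfile after "vagrant up"
    for m in controller node-1 node-2 node-3; do
      vagrant ssh "$m" -c 'cat /tmp/testfile'
    done
    ```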

    Playbook problems

    Now let's look at your actual playbook.

    Package installation issues

    In the first section, you're trying to install some packages:

        - name: Update package cache
          yum:
            name: '*'
            state: latest
          register: yum_update
    
        - name: Install required packages
          package:
            name: "{{ item }}"
            state: present
          with_items:
            - docker
            - python3
    

    The image you're using is based on RHEL8. For system packages, you will need either (a) a Red Hat subscription in order to enable the official repositories, or (b) alternative repositories.

    For Docker, you would need to enable the Docker repositories.

    As currently written, this task will result in:

    TASK [Install required packages] ***********************************************
    failed: [controller] (item=docker) => {"ansible_loop_var": "item", "changed": false, "failures": ["No package docker available."], "item": "docker", "msg": "Failed to install some of the specified packages", "rc": 1, "results": []}
    
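    One common workaround, shown here as shell commands rather than playbook tasks (this is my assumption, not part of the original setup): add Docker's upstream CE repository, where the engine package is named docker-ce rather than docker.

    ```shell
    # Requires dnf-plugins-core; the CentOS repo is widely used on EL8-family hosts
    dnf -y install dnf-plugins-core
    dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
    dnf -y install docker-ce docker-ce-cli containerd.io
    ```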

    Inappropriate configuration

    Next, attempting to disable SELinux in the container doesn't make any sense -- SELinux is a kernel-level configuration that affects the entire host; it cannot be set per-container.

    Service management

    Your playbook is attempting to start a service using the service module, but inside a container like this there is no init system or service manager running. The service module won't work, nor will command-line approaches like systemctl start ....

    Additional complications

    You're trying to run Docker inside a Docker container. While this is possible, it requires some additional configuration.
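    Two common patterns, sketched here under the assumption that you really do need Docker available inside the container:

    ```shell
    # Option 1: share the host daemon's socket; containers started this way
    # are siblings of the current container, not children
    docker run --rm -v /var/run/docker.sock:/var/run/docker.sock docker:cli docker ps

    # Option 2: run a nested Docker daemon in a privileged container
    docker run --privileged -d --name dind docker:dind
    ```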

    Looking at alternatives

    If you want to set up a containerized Kubernetes configuration, consider using kind instead, which was designed for exactly this purpose and is substantially easier to set up; the simplest approach is:

    kind create cluster
    
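    kind can also model the one-controller/three-worker layout above with a declarative config file:

    ```shell
    # Write a kind cluster config mirroring the 4-node topology, then create it
    cat > kind-config.yaml <<'EOF'
    kind: Cluster
    apiVersion: kind.x-k8s.io/v1alpha4
    nodes:
      - role: control-plane
      - role: worker
      - role: worker
      - role: worker
    EOF
    kind create cluster --config kind-config.yaml
    ```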

    Alternatively, if you're trying to reproduce an existing bare-metal Kubernetes installation, consider using virtual machines instead of containers, and pick something other than RHEL as your base image unless you have a subscription.


    The files referenced in this answer can be found in this repository.