docker vagrant apple-m1 nomad

How to make 127.0.0.1:4646 accessible in vagrant?


When I tried to bring up a Vagrant machine in Docker with the command "vagrant up", the following error message was displayed:

default: Error submitting job: Put http://127.0.0.1:4646/v1/jobs?region=global: dial tcp 127.0.0.1:4646: connect: connection refused
The SSH command responded with a non-zero exit status. Vagrant
assumes that this means the command failed. The output for this command
should be in the log above. Please read the output to determine what
went wrong.

I tried several methods, but none of them made 127.0.0.1:4646 accessible. What I want to do next is run "vagrant ssh" and then run Nomad inside the machine, but this issue makes that impossible.

Here are the methods I tried:

  1. Updating Nomad to 1.2.3;
  2. Using the Intel chip version of Docker (I'm on an Apple M1 chip and had tried the M1 version before);
  3. Exporting VAGRANT_ADDR as http://127.0.0.1:4646 or http://localhost:4646.

So how can I solve this problem? Or is there another way to get Nomad working?

-------- First edit --------

After vagrant ssh, I ran nomad agent -dev as @OneCricketeer suggested, and the following output was displayed:

vagrant@example-app-host:~$ nomad agent -dev
==> No configuration files loaded
==> Starting Nomad agent...
==> Error starting agent: client setup failed: fingerprinting failed: cannot detect cpu total compute. CPU compute must be set manually using the client config option "cpu_total_compute"
    2021-12-24T04:28:25.552Z [DEBUG] agent.plugin_loader.docker: using client connection initialized from environment: plugin_dir=
    2021-12-24T04:28:25.555Z [DEBUG] agent.plugin_loader.docker: using client connection initialized from environment: plugin_dir=
    2021-12-24T04:28:25.559Z [INFO ] agent: detected plugin: name=qemu type=driver plugin_version=0.1.0
    2021-12-24T04:28:25.560Z [INFO ] agent: detected plugin: name=java type=driver plugin_version=0.1.0
    2021-12-24T04:28:25.560Z [INFO ] agent: detected plugin: name=docker type=driver plugin_version=0.1.0
    2021-12-24T04:28:25.560Z [INFO ] agent: detected plugin: name=rkt type=driver plugin_version=0.1.0
    2021-12-24T04:28:25.560Z [INFO ] agent: detected plugin: name=raw_exec type=driver plugin_version=0.1.0
    2021-12-24T04:28:25.560Z [INFO ] agent: detected plugin: name=exec type=driver plugin_version=0.1.0
    2021-12-24T04:28:25.560Z [INFO ] agent: detected plugin: name=nvidia-gpu type=device plugin_version=0.1.0
    2021-12-24T04:28:25.579Z [INFO ] nomad: raft: Initial configuration (index=1): [{Suffrage:Voter ID:127.0.0.1:4647 Address:127.0.0.1:4647}]
    2021-12-24T04:28:25.581Z [INFO ] nomad: raft: Node at 127.0.0.1:4647 [Follower] entering Follower state (Leader: "")
    2021-12-24T04:28:25.596Z [INFO ] nomad: serf: EventMemberJoin: seller-app-host.global 127.0.0.1
    2021-12-24T04:28:25.598Z [INFO ] nomad: starting scheduling worker(s): num_workers=4 schedulers="[service batch system _core]"
    2021-12-24T04:28:25.605Z [DEBUG] nomad: lost contact with Nomad quorum, falling back to Consul for server list
    2021-12-24T04:28:25.606Z [INFO ] nomad: adding server: server="seller-app-host.global (Addr: 127.0.0.1:4647) (DC: dc1)"
    2021-12-24T04:28:25.615Z [INFO ] client: using state directory: state_dir=/tmp/NomadClient853050605
    2021-12-24T04:28:25.616Z [INFO ] client: using alloc directory: alloc_dir=/tmp/NomadClient128700264
    2021-12-24T04:28:25.623Z [DEBUG] nomad: memberlist: Failed to join 172.17.0.2: dial tcp 172.17.0.2:4648: connect: connection refused
    2021-12-24T04:28:25.624Z [ERROR] nomad: error looking up Nomad servers in Consul: error="contacted 0 Nomad Servers: 1 error(s) occurred:

* Failed to join 172.17.0.2: dial tcp 172.17.0.2:4648: connect: connection refused"
    2021-12-24T04:28:25.645Z [DEBUG] client.fingerprint_mgr: built-in fingerprints: fingerprinters="[arch cgroup consul cpu host memory network nomad signal storage vault env_aws env_gce]"
    2021-12-24T04:28:25.647Z [DEBUG] client.fingerprint_mgr: fingerprinting periodically: fingerprinter=cgroup period=15s
    2021-12-24T04:28:25.659Z [INFO ] client.fingerprint_mgr.consul: consul agent is available
    2021-12-24T04:28:25.663Z [DEBUG] client.fingerprint_mgr: fingerprinting periodically: fingerprinter=consul period=15s
    2021-12-24T04:28:25.665Z [DEBUG] client.fingerprint_mgr.cpu: detected core count: cores=4

It seems Nomad doesn't start correctly.

Then I ran nomad status, and the same error remains:

vagrant@example-app-host:~$ nomad status
Error querying jobs: Get http://127.0.0.1:4646/v1/jobs?region=global: dial tcp 127.0.0.1:4646: connect: connection refused

How can I deal with it?


Solution

  • It seems you tried to submit a Nomad job before Nomad was started.

    "What I wanna do next is [run nomad]".

    If Nomad isn't running yet, then port 4646 isn't accessible. Port forwarding is covered in the Vagrant documentation; it is not the same thing as VAGRANT_ADDR, and you should unset that variable.

    You don't need port 4646 to use vagrant ssh.
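
To illustrate the ordering problem (job submitted before the agent is listening), here is a minimal sketch of a wait loop that a provisioning script could call before submitting anything. The function name wait_for_nomad is made up for this example, it assumes curl is installed in the box, and it polls Nomad's /v1/status/leader endpoint; treat it as one possible approach, not as what the original provisioning script does.

```shell
#!/usr/bin/env bash
# Poll the Nomad HTTP API until it answers, or give up after a timeout.
# $1 = Nomad address (default http://127.0.0.1:4646)
# $2 = timeout in seconds (default 30)
# wait_for_nomad is a hypothetical helper name, not part of Nomad or Vagrant.
wait_for_nomad() {
  local addr="${1:-http://127.0.0.1:4646}"
  local timeout="${2:-30}"
  local waited=0
  # Keep retrying while the request fails (connection refused, HTTP error, ...).
  until curl --silent --fail --max-time 2 "${addr}/v1/status/leader" >/dev/null 2>&1; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "Nomad did not answer on ${addr} within ${timeout}s" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
}
```

A provisioning step could then run something like `wait_for_nomad && nomad job run example.nomad`, where example.nomad is a placeholder job file. If the loop times out, the agent itself failed to start and its log is the place to look.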


    Regarding provisioning a Nomad cluster, I'd suggest looking into Ansible rather than running Vagrant inside Docker. Normally it is the other way around: Docker is installed within the Vagrant box, which runs on VirtualBox by default.


    Or you can run nomad agent -dev on your host without using any VMs.
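
As for the agent log in the question's edit, it stops at "cannot detect cpu total compute" and itself names the client config option cpu_total_compute. Below is a rough sketch of setting that option by hand. The helper name write_client_config, the file path, and the 4000 MHz default are all assumptions for illustration; the value should roughly match cores times clock speed of the machine.

```shell
#!/usr/bin/env bash
# Write a minimal Nomad client config that sets cpu_total_compute explicitly.
# $1 = output path
# $2 = total CPU in MHz (assumed default of 4000; adjust to the real machine)
# write_client_config is a hypothetical helper name used only in this sketch.
write_client_config() {
  local path="$1"
  local mhz="${2:-4000}"
  cat > "$path" <<EOF
client {
  # Total CPU the client advertises, in MHz; set manually because fingerprinting failed.
  cpu_total_compute = ${mhz}
}
EOF
}
```

For example, `write_client_config /tmp/client.hcl 4000` followed by `nomad agent -dev -config=/tmp/client.hcl` would start the dev agent with that value. Whether this is enough on an M1 host inside Docker is untested here.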