google-cloud-platform mpi google-compute-engine mpich

"Host key verification failed" in a Google Compute Engine based MPICH cluster


TLDR:

I have 2 Google Compute Engine instances and I've installed MPICH on both. When I try to run a sample program, I get "Host key verification failed."

Detailed version:

I've followed this tutorial in order to get this task done: http://mpitutorial.com/tutorials/running-an-mpi-cluster-within-a-lan/.

I have 2 Google Compute Engine VMs running Ubuntu 14.04 (the Google Cloud account is a trial one, btw). I downloaded this version of MPICH on both instances: http://www.mpich.org/static/downloads/3.3rc1/mpich-3.3rc1.tar.gz and installed it using these steps:

./configure --disable-fortran
sudo make
sudo make install

This is the way the /etc/hosts file looks on the master-node:

127.0.0.1 localhost
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
169.254.169.254 metadata.google.internal metadata
10.128.0.3 client
10.128.0.2 master
10.128.0.2 linux1.us-central1-c.c.ultimate-triode-161918.internal linux1  # Added by Google
169.254.169.254 metadata.google.internal  # Added by Google

And this is the way the /etc/hosts file looks on the client-node:

127.0.0.1 localhost
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
169.254.169.254 metadata.google.internal metadata
10.128.0.2 master
10.128.0.3 client
10.128.0.3 linux2.us-central1-c.c.ultimate-triode-161918.internal linux2  # Added by Google
169.254.169.254 metadata.google.internal  # Added by Google
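
To double-check that both files map the names master and client to the intended addresses, a small lookup helper can be run on each node. This is only a sketch: hosts_ip is a hypothetical helper name, not part of MPICH or the tutorial, and it reads the file directly rather than going through the system resolver.

```shell
#!/bin/sh
# hosts_ip FILE NAME
# Print the first IP address that FILE (in /etc/hosts format) maps NAME to.
# Exits non-zero if NAME does not appear in FILE.
# (hosts_ip is a hypothetical helper name used for illustration.)
hosts_ip() {
    awk -v name="$2" '
        { sub(/#.*/, "") }                  # drop comments before matching
        {
            for (i = 2; i <= NF; i++)       # fields after the IP are host names
                if ($i == name) { print $1; found = 1; exit }
        }
        END { exit !found }
    ' "$1"
}
```

With the files shown above, `hosts_ip /etc/hosts master` should print 10.128.0.2 and `hosts_ip /etc/hosts client` should print 10.128.0.3 on both nodes.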

The remaining steps were adding a user named mpiuser on both nodes, configuring passwordless SSH authentication between the nodes, and setting up a shared cloud directory between them.
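
Since ssh prints "Host key verification failed." when it cannot verify the remote host's key against known_hosts, it may help to check whether mpiuser's known_hosts file has an entry for each hostname exactly as it will be typed (an entry recorded for an IP address does not cover the name client). The following is a sketch under assumptions: it relies on OpenSSH's `ssh-keygen -F`, and host_key_known and report_unknown_hosts are hypothetical helper names.

```shell
#!/bin/sh
# host_key_known HOST [KNOWN_HOSTS_FILE]
# Succeeds if HOST has an entry in the known_hosts file
# (default: ~/.ssh/known_hosts). Works with hashed entries too,
# because the lookup is delegated to ssh-keygen -F.
# (Both function names are hypothetical, for illustration only.)
host_key_known() {
    ssh-keygen -F "$1" -f "${2:-$HOME/.ssh/known_hosts}" >/dev/null 2>&1
}

# report_unknown_hosts KNOWN_HOSTS_FILE HOST...
# Print one line for every HOST that has no recorded host key.
report_unknown_hosts() {
    file=$1
    shift
    for h in "$@"; do
        host_key_known "$h" "$file" || echo "no host key recorded for: $h"
    done
}
```

Run as mpiuser on the node that will launch the job, for example `report_unknown_hosts "$HOME/.ssh/known_hosts" client master`. Any host it lists would still trigger an interactive host key prompt on a manual `ssh`.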

The configuration worked up to this point. I downloaded this file https://raw.githubusercontent.com/pmodels/mpich/master/examples/cpi.c to /home/mpiuser/cloud/mpi_sample.c and compiled it this way:

mpicc -o mpi_sample mpi_sample.c

and issued this command on the master node while logged in as the mpiuser:

 mpirun -np 2 -hosts client,master ./mpi_sample

and I got this error:

Host key verification failed.

What's wrong? I've been trying to troubleshoot this problem for more than 2 days, but I can't find a working solution.



Solution

  • Add package-lock.json to the .gcloudignore file, then deploy again.