MPICH2 on multiple machines (HYDU_sock_connect error)


I am trying to run an MPI program on two different PCs. However, when I run this command on pc1:

mpirun -hosts user@host -n 4 bin/Demo_01.exe 

I get this error:

[proxy:0:0@pc2] HYDU_sock_connect (./utils/sock/sock.c:203): unable to connect from "pc2" to "pc1" (Connection refused)

[proxy:0:0@pc2] main (./pm/pmiserv/pmip.c:209): unable to connect to server ubuntu at port 57395 (check for firewalls!)

Although I configured passwordless SSH connections and disabled the firewall on each machine, the error is still there. My operating system is Ubuntu 12.04 and the MPI implementation is MPICH2.

Can anyone help?


Solution

  • Fixed. After I followed these steps, the error disappeared:

    1. Create administrator user accounts on both machines with the same username and password.
    2. Define the hostnames of both machines by editing the file /etc/hosts on each of them.
    3. Do a clean install of ssh on both machines.
    4. Configure ssh to connect without a password. To do this, follow these links: http://www.thegeekstuff.com/2008/11/3-steps-to-perform-ssh-login-without-password-using-ssh-keygen-ssh-copy-id/ and http://dustymabe.com/2012/08/18/exchanging-ssh-keys-using-ssh-copy-id/
    5. Place the MPI executable at the same path on both machines.
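
A possible reason step 2 matters: a default Ubuntu install usually maps the machine's own hostname to 127.0.1.1 in /etc/hosts, and a name that resolves to a loopback address cannot be reached from the other PC. That may explain the "Connection refused" above, though the post does not confirm it. The following is a minimal sketch for checking the entries. The function names are made up for illustration, and the hostnames pc1 and pc2 are assumed to be the names defined in step 2.

```shell
#!/bin/sh
# Sketch only: lookup_host, is_loopback and check_host are hypothetical helpers,
# not part of MPICH or ssh.

# Print the address that a hosts-format file ($1) assigns to a hostname ($2).
lookup_host() {
    awk -v name="$2" '
        /^[[:space:]]*#/ { next }              # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break           # stop at an inline comment
                if ($i == name) { print $1; exit }
            }
        }
    ' "$1"
}

# Succeed if the address ($1) is a loopback address.
is_loopback() {
    case "$1" in
        127.*|::1) return 0 ;;
        *) return 1 ;;
    esac
}

# Report whether a hostname ($2) in a hosts file ($1) looks usable as a node name.
check_host() {
    addr=$(lookup_host "$1" "$2")
    if [ -z "$addr" ]; then
        echo "$2: no entry"
    elif is_loopback "$addr"; then
        echo "$2: loopback ($addr)"
    else
        echo "$2: ok ($addr)"
    fi
}
```

Running `check_host /etc/hosts pc1` and `check_host /etc/hosts pc2` on each machine should report a LAN address for both names. Once that holds and `ssh pc2 hostname` works from pc1 without a password prompt (step 4), a launch along the lines of `mpirun -hosts pc1,pc2 -n 4 bin/Demo_01.exe` is worth retrying.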