cwindowsmpimpich

MPICH: How to publish_name such that a client application can lookup_name it?


While learning MPI using MPICH in windows (1.4.1p1) I found some sample code here. Originally, when I ran the server, I would have to copy the generated port_name and start the client with it. That way, the client can connect to the server. I modified it to include MPI_Publish_name() in the server instead. After launching the server with a name of aaaa, I launch the client which fails MPI_Lookup_name() with

Invalid service name (see MPI_Publish_name), error stack:
MPID_NS_Lookup(87): Lookup failed for service name aaaa

Here are the snipped bits of code:

server.c

MPI_Comm client; 
MPI_Status status; 
char port_name[MPI_MAX_PORT_NAME];
char serv_name[256];
double buf[MAX_DATA]; 
int size, again; 
int res = 0;

MPI_Init( &argc, &argv ); 
MPI_Comm_size(MPI_COMM_WORLD, &size); 
MPI_Open_port(MPI_INFO_NULL, port_name);
sprintf(serv_name, "aaaa");
MPI_Publish_name(serv_name, MPI_INFO_NULL, port_name);

while (1) 
{ 
    MPI_Comm_accept( port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &client );
    /*...snip...*/
}

client.c

MPI_Comm server; 
double buf[MAX_DATA]; 
char port_name[MPI_MAX_PORT_NAME]; 
memset(port_name,'\0',MPI_MAX_PORT_NAME);
char serv_name[256];
memset(serv_name,'\0',256);

strcpy(serv_name, argv[1] )
MPI_Lookup_name(serv_name, MPI_INFO_NULL, port_name);
MPI_Comm_connect( port_name, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &server ); 
MPI_Send( buf, 0, MPI_DOUBLE, 0, tag, server ); 
MPI_Comm_disconnect( &server ); 
MPI_Finalize(); 
return 0; 

I cannot find any information about altering visibility of published names, if that is even the problem. MPICH seems to not have implemented anything with MPI_INFO. I would try openMPI but I am having trouble just building it. Any suggestions?


Solution

  • This approach of publishing names, looking them up, and connecting to them is outlandish relative to normal MPI usage.

    The standard pattern is to use mpirun to specify a set of nodes on which to launch a given number of processes. The operation of common implementations of mpirun implementations is explained in another question

    Once the processes are all launched as part of a single parallel job, the MPI library reads whatever information the launcher provided during MPI_Init to set up MPI_COMM_WORLD, a communicator over the group of all processes in the job.

    Using that communicator, the parallel application can distribute work, exchange information, and so forth. It would do this using the common MPI_Send and MPI_Recv routines, in all their variants, the collective operations, and so forth.