tensorflow mpi grpc infiniband rdma

Does gRPC+MPI require RDMA?


TensorFlow accepts "grpc", "grpc+verbs" and "grpc+mpi" when specifying a communication protocol. The grpc+verbs documentation clearly states that this protocol is based on RDMA. The grpc+mpi documentation doesn't imply this at all, so I initially assumed that grpc+mpi can run on any underlying network. However, this research paper implies that grpc+mpi is required to run over RDMA (see the end of page 3). Am I misinterpreting the research paper? Can grpc+mpi in fact run over any network?


Solution

  • I found the answer: page 4 of the same research paper indicates that the MPI channel is simply capable of supporting RDMA, rather than requiring it.
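
For reference, the protocol discussed above is chosen when the server is created. The following is only a minimal sketch, assuming TensorFlow 1.x and a build compiled with the relevant contrib module (MPI or verbs); `server_kwargs`, `start_server`, and the host names are hypothetical and used purely for illustration. Nothing in it selects the network fabric: which transport MPI actually uses is decided by the MPI installation, not by this string.

```python
# Protocol strings mentioned in the question. The non-default ones are
# assumed to need a TensorFlow 1.x build with the matching module enabled.
PROTOCOLS = ("grpc", "grpc+verbs", "grpc+mpi")


def server_kwargs(cluster, job_name, task_index, protocol="grpc"):
    """Hypothetical helper: validate the protocol and collect Server arguments."""
    if protocol not in PROTOCOLS:
        raise ValueError("unknown protocol: %r" % (protocol,))
    return {
        "cluster": cluster,
        "job_name": job_name,
        "task_index": task_index,
        "protocol": protocol,
    }


def start_server(kwargs):
    """Hypothetical helper: create the server (assumes TensorFlow 1.x is installed)."""
    import tensorflow as tf  # imported lazily so the helper above works without TF

    spec = tf.train.ClusterSpec(kwargs["cluster"])
    return tf.train.Server(
        spec,
        job_name=kwargs["job_name"],
        task_index=kwargs["task_index"],
        protocol=kwargs["protocol"],
    )


# Placeholder host names; replace with real addresses.
cluster = {"ps": ["host-a:2222"], "worker": ["host-b:2222"]}
kwargs = server_kwargs(cluster, "worker", 0, protocol="grpc+mpi")
# server = start_server(kwargs)  # uncomment on a machine with an MPI-enabled build
```

The actual server creation is left commented out because it only works on a suitably built installation, typically launched through the MPI runtime.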