Tags: apache-spark, pyspark, infiniband, multihomed

Multihomed Spark Cluster


I am working on setting up a Spark cluster in a multihomed network situation and have run into some problems. I'll start with the physical configuration.

I have 12 nodes, all in one rack, with a 100G InfiniBand network (using IPoIB) for inter-node traffic and a 1G management network.

Spark works great when I run jobs from the master node on the cluster, but now I am trying to run jobs from my workstation, which is connected to the management network, and that is where I ran into trouble.

All of the Spark nodes have their hosts files pointing to the InfiniBand network, since I want them to communicate over that network. I had to set SPARK_MASTER_HOST on the master node to 0.0.0.0 just to be able to connect to the master from my workstation.
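For reference, this is roughly how the master is configured (a sketch of the setup described above; the file path is the standard standalone-mode location):

```shell
# conf/spark-env.sh on the master node
# Bind the standalone master to all interfaces so clients on the
# management network can reach it, not only hosts on the IB subnet.
SPARK_MASTER_HOST=0.0.0.0
```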

Now I can create a SparkSession and perform operations, but it always hangs, and when I look at the worker logs I see that they are getting a "No route to host" error. It seems that even though the default route on each node points to the management subnet, the workers are trying to connect back to the client over the InfiniBand network. (I should point out that I can ping my workstation from all of the nodes, so I know the network route is fine. All of the firewalls are also off at the moment.)

As a side note, because of this setup the Spark master web interface doesn't work very well: all of the links to the workers point to their InfiniBand IP addresses, so they always fail. If you manually change the IP in the address bar to the correct subnet, it works. This would be nice to fix as well, but it's not really a big deal.

I looked through the Spark documentation but didn't find anything that seemed helpful, and I've played around with some of the network settings without much luck. I have a hard time believing that Spark doesn't support having a private network, but maybe that is the case.

I appreciate any help or ideas you guys can give me.


Solution

  • I used to face these issues all the time (in the context of InfiniBand as well) and never found a proper solution, just a few workarounds. The issue is that Spark does not allow the client<->master and master<->workers connections to run on different networks.

    Workaround #1: Use YARN cluster mode. Your app will run within the YARN cluster (which I assume is the same as your Spark cluster), thus having access to the InfiniBand network. In this case, you will have to make sure that jobs can be submitted to YARN through the management network. YARN will then start Spark processes that bind to the InfiniBand network.
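    As a sketch of this workaround (assuming YARN is already running on the cluster and `HADOOP_CONF_DIR` points at a ResourceManager address that is reachable from your workstation over the management network; the script name is a placeholder):

    ```shell
    # Submit from the workstation. With --deploy-mode cluster the driver
    # runs inside the YARN cluster, so all driver<->executor traffic
    # stays on the InfiniBand network; only the submission itself
    # crosses the management network.
    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      my_job.py
    ```

    The trade-off is that the driver no longer runs on your workstation, so interactive use (e.g. a local PySpark shell) is not covered by this workaround.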

    Workaround #2: Proxy. Another option (which I didn't try, but should work) is to set up a proxy daemon on the Spark master node that relays traffic between your management and IB networks.
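    One simple way to approximate such a relay (untested; hostnames, username, and ports are placeholders) is SSH port forwarding through the master node's management-network address:

    ```shell
    # Forward client -> master traffic. The standalone master listens
    # on 7077 by default, and the web UI on 8080.
    # 'master-mgmt' stands for the master's management-subnet hostname.
    ssh -N \
      -L 7077:localhost:7077 \
      -L 8080:localhost:8080 \
      user@master-mgmt
    ```

    Note that this only covers the client-to-master direction; the workers also need to reach back to the driver, which would additionally require pinning the driver's port (e.g. via `spark.driver.port`) and setting up a reverse tunnel for it. That extra complexity is why the YARN cluster-mode workaround is usually the simpler option.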