docker apache-spark kubernetes nebula-graph

How can I run Spark Utils (exchange, algorithm, spark-connector) against a containerized NebulaGraph?


It seems that when the NebulaGraph cluster is deployed in containers (Docker Compose or K8s), I cannot get the Spark Connector to read the graph data properly, no matter what I try.

The workaround I came up with was to run Spark inside the container network.

However, this is not always feasible, especially since we will already have existing Spark infrastructure in production.

Could anyone explain what exactly is needed to make Spark, running outside the container network, work with a NebulaGraph cluster running in containers?


Solution

  • To run Spark Utils (exchange, algorithm, spark-connector) against a containerized NebulaGraph database, you need to make sure the Spark jobs can reach the NebulaGraph cluster over the network.

    If you want to run Spark outside of the container network and connect it to a NebulaGraph cluster running in containers, you need to expose the NebulaGraph services to the external network. You can do this by publishing the service ports and specifying an external network in the networks field of the NebulaGraph cluster's Docker Compose file, or by creating a Service backed by an external load balancer in K8s (see the sketches after this list).

    You will also need to ensure that the necessary ports are reachable (by default 9559 for the Meta service, 9779 for the Storage service, and 9669 for the Graph service) and that the Spark Utils have the correct connection details (e.g. hostname, port, username, password). Note that the Spark Connector first asks the Meta service for the addresses of the Storage daemons and then reads data from them directly, so those advertised addresses must also be resolvable and reachable from the Spark side, not just whichever ports you happened to publish. A connection sketch follows the configuration examples below.
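For the Docker Compose case, a minimal sketch of what the exposure could look like is below. The service names (metad0, storaged0, graphd) follow the layout of the official nebula-docker-compose files, and nebula-net is a hypothetical pre-created external network; adjust both to your deployment.

```yaml
# Sketch of a NebulaGraph docker-compose.yaml excerpt (service names follow
# the official nebula-docker-compose layout; adjust to your deployment).
services:
  metad0:
    # ... image, command, volumes elided ...
    ports:
      - "9559:9559"   # Meta service: Spark Connector fetches metadata here
    networks:
      - nebula-net
  storaged0:
    # ... image, command, volumes elided ...
    ports:
      - "9779:9779"   # Storage service: Spark Connector scans data from here
    networks:
      - nebula-net
  graphd:
    # ... image, command, volumes elided ...
    ports:
      - "9669:9669"   # Graph service: used for queries and inserts
    networks:
      - nebula-net

networks:
  nebula-net:
    external: true    # hypothetical pre-created network, e.g. `docker network create nebula-net`
```

Keep in mind that with a multi-replica setup, every metad and storaged replica needs to be exposed on a distinct host port, and the hostnames they register with the Meta service must resolve from the Spark host (for example via /etc/hosts entries).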
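On K8s, the equivalent is a Service of type LoadBalancer in front of each NebulaGraph component. The sketch below exposes graphd; the name, namespace, and selector label are assumptions and must match your actual deployment, and metad/storaged need analogous Services for the Spark Connector to reach them.

```yaml
# Hypothetical LoadBalancer Service for the NebulaGraph Graph daemon.
apiVersion: v1
kind: Service
metadata:
  name: nebula-graphd-lb      # assumed name
  namespace: nebula           # assumed namespace
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/component: graphd   # assumed label; match your pods
  ports:
    - name: thrift
      port: 9669
      targetPort: 9669
```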
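Once the cluster is reachable, the connection details are passed to the Spark Connector roughly as follows. This is a sketch based on the connector's documented read API (com.vesoft.nebula.connector); the meta address, space, and tag names are placeholders, and method names may differ slightly across connector versions.

```scala
import org.apache.spark.sql.SparkSession
import com.vesoft.nebula.connector.{NebulaConnectionConfig, ReadNebulaConfig}
import com.vesoft.nebula.connector.connector.NebulaDataFrameReader // enables spark.read.nebula(...)

object NebulaReadSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("nebula-read-sketch")
      .getOrCreate()

    // The Meta address must be reachable from the Spark driver/executors,
    // and the Storage addresses it hands back must resolve here as well.
    // "nebula-meta.example.com" is a placeholder for your exposed metad endpoint.
    val connectionConfig = NebulaConnectionConfig.builder()
      .withMetaAddress("nebula-meta.example.com:9559")
      .withConenctionRetry(2)   // (sic) spelled this way in the connector API
      .build()

    // Placeholder space/tag names; replace with your own schema.
    val readVertexConfig = ReadNebulaConfig.builder()
      .withSpace("basketballplayer")
      .withLabel("player")
      .withNoColumn(false)
      .withLimit(1000)
      .withPartitionNum(10)
      .build()

    // Scan the tag's vertices into a DataFrame directly from the Storage daemons.
    val vertices = spark.read.nebula(connectionConfig, readVertexConfig).loadVerticesToDF()
    vertices.show(10)

    spark.stop()
  }
}
```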