Tags: amazon-web-services, elasticsearch, databricks, elasticsearch-hadoop

Do Databricks workers and Elasticsearch nodes need to be in the same VPC in AWS?


I would like to write a dataframe into Elasticsearch from within Databricks.

My Elasticsearch cluster is hosted on AWS, and Databricks spins up EC2 instances with a certain IAM role. That role has permission to interact with my Elasticsearch cluster, but for some reason I cannot even ping the cluster.

[Screenshot: failed attempt to ping my cluster]

Do I need to find a way to squeeze both my Databricks workers and my Elasticsearch cluster into the same VPC? Sounds like a CloudFormation nightmare.


Solution

  • If ES is running in a different VPC from your Databricks workers, you'll need either AWS PrivateLink or VPC peering so that the workers can reach it. For isolation, and to avoid running into IP address limits for your workers, it is better to keep ES and Databricks in different VPCs.
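
Once peering or PrivateLink is in place, connectivity can be checked from a Databricks notebook. A minimal sketch, assuming the cluster is served over HTTPS on port 443: ICMP ping may be blocked even when the endpoint is reachable, so a TCP connection test is usually more telling. The helper name `can_reach` and the host in the comment are hypothetical; substitute your own endpoint.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Hypothetical endpoint, shown for illustration only:
# print(can_reach("vpc-my-domain.us-east-1.es.amazonaws.com", 443))
```

If this returns False from a worker, the likely suspects are routing between the two VPCs or the security group attached to the ES cluster, rather than the role's permissions.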