Our current DataStax datacenter contains 6 nodes, all with both Solr and Graph enabled:
root@ip-10-10-5-36:~# cat /etc/default/dse | grep -E 'SOLR_ENABLED|GRAPH_ENABLED'
GRAPH_ENABLED=1
SOLR_ENABLED=1
root@ip-10-10-5-36:~# nodetool status
Datacenter: SearchGraph
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 10.10.5.56 456.58 MiB 1 ? 936a1ac0-6d5e-4a94-8953-d5b5a2016b92 rack1
UN 10.10.5.46 406.24 MiB 1 ? 3f41dc2a-2672-47a1-90b5-a7c2bf17fb50 rack1
UN 10.10.5.76 392.99 MiB 1 ? 29f8fe44-3431-465e-b682-5d24e37d41d7 rack2
UN 10.10.5.66 414.16 MiB 1 ? 1f7de531-ff51-4581-bdb8-d9a686f1099e rack2
UN 10.10.5.86 424.3 MiB 1 ? 27d37833-56c8-44bd-bac0-7511b8bd74e8 rack2
UN 10.10.5.36 511.44 MiB 1 ? 0822145f-4225-4ad3-b2be-c995cc230830 rack1
We are planning to add Spark to our existing datacenter. My questions are:
1) Will enabling Spark affect the existing data and services in DataStax?
2) Or, instead of setting SPARK_ENABLED=1, do we need to set up a separate datacenter for Spark?
Updated:
3) How do DC1 and DC2 connect to each other in the ring? Is it based on the same cluster name specified in the cluster_name: parameter? Conf file: /etc/dse/cassandra/cassandra.yaml
4) Is any separate configuration needed to specify the Spark master in the datacenter?
5) Do I need to specify the SearchGraph (DC1) seed IP in the Spark (DC2) seed configuration section? Or does only the Spark seed IP need to be specified in the DC2 configuration section (cassandra.yaml)?
It's recommended to create a separate datacenter for DSE Analytics. The full process is described in the documentation.
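As a rough illustration for questions 3 and 5: nodes in both datacenters form one ring when they share the same cluster_name, and a common practice is to list seed nodes from every datacenter in the seeds list. The fragment below is only a sketch, not taken from the actual cluster: the cluster name and the 10.10.6.10 address are hypothetical placeholders, and it assumes 10.10.5.36 and 10.10.5.76 act as seeds in the existing SearchGraph datacenter.

```yaml
# /etc/dse/cassandra/cassandra.yaml on a node in the new analytics datacenter
# (sketch only; the values below are placeholders, not the real cluster's settings)

# Must be identical to the cluster_name used by the existing SearchGraph nodes.
cluster_name: 'MyCluster'   # hypothetical name

seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # Assumed seeds from the existing datacenter, plus one seed from the new one.
          # 10.10.6.10 is a hypothetical address for a node in the new datacenter.
          - seeds: "10.10.5.36,10.10.5.76,10.10.6.10"
```

The nodes in the new datacenter would then be the ones started with SPARK_ENABLED=1 in /etc/default/dse, leaving the existing SearchGraph nodes unchanged. The documentation remains the authority for the exact steps.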