cassandra datastax-enterprise

DSE nodes went down with ConfigurationException: Unknown data-center name ... passed to NetworkTopologyStrategy


My 3 Cassandra staging nodes went down yesterday and will not start again. Below is the relevant excerpt from system.log:

INFO  [CoreThread-1] 2023-02-19 16:58:46,595  NodeSyncService.java:381 - Enabled Incremental NodeSync trackers for 10 tables in 394ms
ERROR [DSE main thread] 2023-02-19 16:58:46,697  CassandraDaemon.java:932 - Fatal configuration error
org.apache.cassandra.exceptions.ConfigurationException: Unknown data-center name 'Staging_cluster' passed to NetworkTopologyStrategy for keyspace 'system_auth': it is either unknown to the configured snitch, or has no active member (known DCs: {Staging_cluste: 3 nodes})
        at org.apache.cassandra.locator.NetworkTopologyStrategy.validateExpectedOptions(NetworkTopologyStrategy.java:280)
        at org.apache.cassandra.locator.AbstractReplicationStrategy.validateReplicationStrategy(AbstractReplicationStrategy.java:337)
        at org.apache.cassandra.schema.ReplicationParams.validate(ReplicationParams.java:94)
        at org.apache.cassandra.schema.KeyspaceMetadata.validate(KeyspaceMetadata.java:97)
        at org.apache.cassandra.schema.KeyspaceMetadata.<init>(KeyspaceMetadata.java:85)
        at org.apache.cassandra.schema.KeyspaceMetadata.create(KeyspaceMetadata.java:167)
        at org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspace(SchemaKeyspace.java:1154)
        at org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspaces(SchemaKeyspace.java:1769)
        at org.apache.cassandra.schema.SchemaManager.merge(SchemaManager.java:893)
        at org.apache.cassandra.schema.SchemaManager.mergeAndAnnounceVersion(SchemaManager.java:877)
        at org.apache.cassandra.schema.MigrationManager.lambda$announce$7(MigrationManager.java:350)
        at io.reactivex.internal.operators.completable.CompletableFromRunnable.subscribeActual(CompletableFromRunnable.java:35)
        at io.reactivex.Completable.subscribe(Completable.java:2302)
        at io.reactivex.internal.operators.completable.CompletableDefer.subscribeActual(CompletableDefer.java:43)
        at io.reactivex.Completable.subscribe(Completable.java:2302)
        at io.reactivex.internal.operators.completable.CompletableConcatIterable$ConcatInnerObserver.next(CompletableConcatIterable.java:119)
        at io.reactivex.internal.operators.completable.CompletableConcatIterable.subscribeActual(CompletableConcatIterable.java:47)
        at io.reactivex.Completable.subscribe(Completable.java:2302)
        at io.reactivex.internal.operators.completable.CompletableDefer.subscribeActual(CompletableDefer.java:43)
        at io.reactivex.Completable.subscribe(Completable.java:2302)
        at io.reactivex.internal.operators.completable.CompletableAndThenCompletable$SourceObserver.onComplete(CompletableAndThenCompletable.java:67)
        at io.reactivex.internal.operators.completable.CompletableAndThenCompletable$NextObserver.onComplete(CompletableAndThenCompletable.java:99)
        at io.reactivex.internal.disposables.EmptyDisposable.complete(EmptyDisposable.java:68)
        at io.reactivex.internal.operators.completable.CompletableEmpty.subscribeActual(CompletableEmpty.java:27)
        at io.reactivex.Completable.subscribe(Completable.java:2302)
        at io.reactivex.internal.operators.completable.CompletableDefer.subscribeActual(CompletableDefer.java:43)
        at io.reactivex.Completable.subscribe(Completable.java:2302)
        at io.reactivex.internal.operators.completable.CompletableAndThenCompletable$SourceObserver.onComplete(CompletableAndThenCompletable.java:67)
        at io.reactivex.internal.disposables.EmptyDisposable.complete(EmptyDisposable.java:68)
        at io.reactivex.internal.operators.completable.CompletableEmpty.subscribeActual(CompletableEmpty.java:27)
        at io.reactivex.Completable.subscribe(Completable.java:2302)
        at io.reactivex.internal.operators.completable.CompletableAndThenCompletable.subscribeActual(CompletableAndThenCompletable.java:35)
        at io.reactivex.Completable.subscribe(Completable.java:2302)
        at io.reactivex.internal.operators.completable.CompletableAndThenCompletable.subscribeActual(CompletableAndThenCompletable.java:35)
        at io.reactivex.Completable.subscribe(Completable.java:2302)
        at io.reactivex.internal.operators.completable.CompletablePeek.subscribeActual(CompletablePeek.java:51)
        at io.reactivex.Completable.subscribe(Completable.java:2302)
        at io.reactivex.Completable.blockingAwait(Completable.java:1219)
        at org.apache.cassandra.concurrent.TPCUtils.blockingAwait(TPCUtils.java:87)
        at org.apache.cassandra.service.StorageService.finishJoiningRing(StorageService.java:1580)
        at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1456)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:933)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:852)
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:419)
        at com.datastax.bdp.server.DseDaemon.setup(DseDaemon.java:541)
        at org.apache.cassandra.service.CassandraDaemon.activate0(CassandraDaemon.java:754)
        at org.apache.cassandra.service.CassandraDaemon.access$100(CassandraDaemon.java:88)
        at org.apache.cassandra.service.CassandraDaemon$3.run(CassandraDaemon.java:715)
INFO  [GossipStage:1] 2023-02-19 16:58:51,490  Gossiper.java:1304 - InetAddress /10.**.***.*** is now DOWN
INFO  [GossipStage:1] 2023-02-19 16:58:51,494  Gossiper.java:1349 - WRITING LOCAL JOIN INFO to [com.datastax.bdp.util.Addresses$Internode$AddressCacheManager@49298fe6, org.apache.cassandra.service.disk.usage.DiskUsageBroadcaster@131d9092, org.apache.cassandra.gms.Gossiper$2@a50eda0, org.apache.cassandra.service.StorageService@49544ce8, org.apache.cassandra.locator.ReconnectableSnitchHelper@4c87bc9b, org.apache.cassandra.service.LoadBroadcaster@33e12c99]
WARN  [GossipStage:1] 2023-02-19 16:58:51,532  NoSpamLogger.java:98 - Cannot answer echo request because this node is not yet initialized.
WARN  [GossipTasks:1] 2023-02-19 16:58:52,482  FailureDetector.java:294 - Not marking nodes down due to local pause of 70099399166 > 5000000000
INFO  [StorageServiceShutdownHook] 2023-02-19 16:58:56,710  DseDaemon.java:886 - DSE shutting down...

Kindly assist with a quick fix.


Solution

  • This is your problem:

    ConfigurationException: Unknown data-center name 'Staging_cluster' passed
    to NetworkTopologyStrategy for keyspace 'system_auth': it is either 
    unknown to the configured snitch, or has no active member (known DCs: 
    {Staging_cluste: 3 nodes})
    

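    Basically, the data center name in the replication settings of your system_auth keyspace (Staging_cluster) does not match the data center name of any node in the cluster. According to the error, the only data center the snitch knows about is Staging_cluste, without the trailing "r".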

    If you're using the GossipingPropertyFileSnitch, check the dc entry in the cassandra-rackdc.properties file on each node to see which data center name the nodes are actually configured with.
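
To make the mismatch easier to see, here is a minimal Python sketch that pulls the two names out of the exception text and reads the dc key from the contents of a cassandra-rackdc.properties file. It is not part of DSE or Cassandra tooling. The helper names parse_dc_error and read_rackdc_dc are made up for this illustration, the regular expression assumes the message looks exactly like the one in the question, and the sample properties content is hypothetical.

```python
import re

# Message copied from the system.log excerpt in the question.
ERROR_LINE = (
    "org.apache.cassandra.exceptions.ConfigurationException: Unknown data-center name "
    "'Staging_cluster' passed to NetworkTopologyStrategy for keyspace 'system_auth': "
    "it is either unknown to the configured snitch, or has no active member "
    "(known DCs: {Staging_cluste: 3 nodes})"
)


def parse_dc_error(line):
    """Return (requested_dc, keyspace, known_dcs) parsed from the exception text."""
    m = re.search(
        r"Unknown data-center name '([^']+)' passed to NetworkTopologyStrategy "
        r"for keyspace '([^']+)'.*\(known DCs: \{([^}]*)\}\)",
        line,
    )
    if m is None:
        raise ValueError("line does not look like the expected ConfigurationException")
    requested, keyspace, known_raw = m.groups()
    # Assumption: each entry looks like "Name: 3 nodes"; keep only the name.
    known = [e.split(":")[0].strip() for e in known_raw.split(",") if e.strip()]
    return requested, keyspace, known


def read_rackdc_dc(properties_text):
    """Return the value of the dc key from cassandra-rackdc.properties content, or None."""
    for raw in properties_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == "dc":
            return value.strip()
    return None


requested, keyspace, known = parse_dc_error(ERROR_LINE)
print(f"keyspace {keyspace!r} replicates to {requested!r}; snitch knows {known}")
print("match" if requested in known else "mismatch")

# Hypothetical file content, for illustration only; read your real file instead.
sample = "# example only\ndc=Staging_cluste\nrack=rack1\n"
print("dc in properties:", read_rackdc_dc(sample))
```

If the name in the keyspace definition and the name in the properties file differ, as they do in the log above, that difference is the thing to reconcile.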