amazon-ec2 cassandra cqlsh

CQLSH connection refused on EC2 Cassandra cluster nodes


I am trying to set up a Cassandra cluster on four EC2 t2.2xlarge nodes, with one node nominated as the seed. The cluster seems to have started on each node. However, when I try to run /opt/cassandra/bin/cqlsh I get the following error:

Connection error: ('Unable to connect to any servers', {'127.0.0.1': error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused")})

When I do a netstat on 9042 on the seed node, I get the following output:

Proto  Recv-Q  Send-Q  Local Address                Foreign Address             State
tcp         0       0  ip-172-xx-xx-111.eu-wes:9042  *:*                         LISTEN

I'm thinking that this host address could be the source of the problem, but I don't know how it would have been set to that, or how to change it. Should it be 127.0.0.1 or localhost?

I have a security group set up with the following information for port 9042:

Type               Protocol    Port Range    Source
-------------------------------------------------------------------------
Custom TCP Rule    TCP         9042          sg-<group-id> (<group-name>)

Perhaps the source here is at fault? Should this be localhost or something?

Below are the values in cassandra.yaml that I have changed on each node:

listen_interface: eth0

broadcast_address: <local-PRIVATE-ip>

rpc_address: <local-PRIVATE-ip>

seed_provider:
# Addresses of hosts that are deemed contact points.
# Cassandra nodes use this list of hosts to find each other and learn
# the topology of the ring.  You must change this if you are running
# multiple nodes!
- class_name: org.apache.cassandra.locator.SimpleSeedProvider
  parameters:
      # seeds is actually a comma-delimited list of addresses.
      # Ex: "<ip1>,<ip2>,<ip3>"
      - seeds: "<seed-node-PRIVATE-ip>"

When I start each node, the final messages in the logs are:

INFO  11:32:44 Node /172.xx.xx.222 state jump to NORMAL
INFO  11:32:44 Waiting for gossip to settle before accepting client requests...
INFO  11:32:44 Compacted 4 sstables to [/var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/system-local-ka-13,].  11,190 bytes to 5,773 (~51% of original) in 24ms = 0.229398MB/s.  4 total partitions merged to 1.  Partition merge counts were {4:1, }
INFO  11:32:52 No gossip backlog; proceeding

The final few lines of the seed node's logs are:

INFO  11:58:35 Enqueuing flush of local: 578 (0%) on-heap, 0 (0%) off-heap
INFO  11:58:35 Writing Memtable-local@1006553205(0.081KiB serialized bytes, 4 ops, 0%/0% of on/off-heap limit)
INFO  11:58:35 Completed flushing /var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/system-local-tmp-ka-14-Data.db (0.000KiB) for commitlog position ReplayPosition(segmentId=1550836714360, position=94125)
INFO  11:58:35 Handshaking version with /172.xx.xx.222
INFO  11:58:35 Node /172.xx.xx.333 has restarted, now UP
INFO  11:58:35 Handshaking version with /172.xx.xx.333
INFO  11:58:35 Node /172.xx.xx.333 state jump to NORMAL
INFO  11:58:35 Enqueuing flush of local: 51462 (0%) on-heap, 0 (0%) off-heap
INFO  11:58:35 Writing Memtable-local@961534831(8.349KiB serialized bytes, 259 ops, 0%/0% of on/off-heap limit)
INFO  11:58:35 Completed flushing /var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/system-local-tmp-ka-15-Data.db (0.000KiB) for commitlog position ReplayPosition(segmentId=1550836714360, position=106779)
INFO  11:58:35 InetAddress /172.xx.xx.333 is now UP
INFO  11:58:35 Node /172.xx.xx.111 state jump to NORMAL
INFO  11:58:35 Updating topology for /172.xx.xx.333
INFO  11:58:35 Updating topology for /172.xx.xx.333
INFO  11:58:35 Node /172.xx.xx.444 has restarted, now UP
INFO  11:58:35 Waiting for gossip to settle before accepting client requests...
INFO  11:58:35 Node /172.xx.xx.444 state jump to NORMAL
INFO  11:58:35 Handshaking version with /172.xx.xx.444
INFO  11:58:35 InetAddress /172.xx.xx.444 is now UP
INFO  11:58:35 Updating topology for /172.xx.xx.444
INFO  11:58:35 Updating topology for /172.xx.xx.444
INFO  11:58:35 Node /172.xx.xx.222 has restarted, now UP
INFO  11:58:35 Node /172.xx.xx.222 state jump to NORMAL
INFO  11:58:35 InetAddress /172.xx.xx.222 is now UP
INFO  11:58:35 Updating topology for /172.xx.xx.222
INFO  11:58:35 Updating topology for /172.xx.xx.222
INFO  11:58:38 Updating topology for all endpoints that have changed
INFO  11:58:43 No gossip backlog; proceeding

So each of the IPs of the other non-seed nodes (172.xx.xx.222/333/444) seems to be reported as UP. The seed node (172.xx.xx.111) just reports as state jump to NORMAL.


Solution

  • Connection error: ('Unable to connect to any servers', {'127.0.0.1': error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused")})
    

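    It looks like you're trying to connect via cqlsh to 127.0.0.1, which is what cqlsh uses when no host is given. That does not work here: Cassandra binds the native transport port (9042) to rpc_address, which you have set to the node's private IP, so nothing is listening on the loopback address (your netstat output shows the same thing). Specify the exact (broadcast) IP, with your credentials if authentication is enabled, and it should let you in.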

    Ex:

    $ grep _address conf/cassandra.yaml | grep -v "#"
    
    listen_address: 192.168.1.4
    broadcast_address: 10.1.3.6
    rpc_address: 192.168.1.4
    broadcast_rpc_address: 10.1.3.6
    
    $ bin/cqlsh 10.1.3.6 -u flynn -p reindeerFlotilla
    
    Connected to AaronTest at 10.1.3.6:9042.
    [cqlsh 5.0.1 | Cassandra 3.11.2 | CQL spec 3.4.4 | Native protocol v4]
    Use HELP for help.
    flynn@cqlsh>
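
To avoid typing the address by hand on each node, the lookup above can be wrapped in a small helper. This is only a sketch: the function name rpc_address_from_yaml is made up for illustration, and it assumes rpc_address appears as a plain, unquoted, uncommented "rpc_address: <ip>" line at the start of a line in cassandra.yaml.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cassandra): print the value of the
# uncommented rpc_address setting from a cassandra.yaml file.
# Assumes a plain "rpc_address: <ip>" line with an unquoted value.
rpc_address_from_yaml() {
    # $1: path to cassandra.yaml
    # The ^ anchor skips commented lines and broadcast_rpc_address.
    awk '/^rpc_address:/ { print $2; exit }' "$1"
}

# Example usage (the conf path is an assumption; adjust to your install):
#   /opt/cassandra/bin/cqlsh "$(rpc_address_from_yaml /opt/cassandra/conf/cassandra.yaml)" 9042
```

If the node sets rpc_interface instead of rpc_address, the helper prints nothing, and you would need to pass the interface's IP to cqlsh yourself.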