docker, aerospike, aerospike-ce

Unable to create a 2-node Aerospike cluster using the Docker image


I am currently experimenting with the Aerospike Docker image (aerospike/aerospike-server) and I’m facing difficulties in setting up a simple 2-node cluster on my Mac. I am using Aerospike Community Edition build 6.3.0.2. To recreate the issue, please follow these commands:

  1. Create a Docker network named “aerospike-network”:

    docker network create aerospike-network
    
  2. Run the first Aerospike node container:

    docker run -d --name aerospike-node1 --network aerospike-network aerospike/aerospike-server
    
  3. Run the second Aerospike node container:

    docker run -d --name aerospike-node2 --network aerospike-network aerospike/aerospike-server
    

Next, modify the configuration of both Docker containers. Here are the updated configurations.

Aerospike Node-1 Configuration:

service {
}

logging {
    console {
        context any info
    }
}

network {
    service {
        address any
        port 3000
    }

    heartbeat {
        mode mesh
        address any
        port 3002
        mesh-seed-address-port aerospike-node1 3002
        mesh-seed-address-port aerospike-node2 3002
        interval 150
        timeout 10
    }

    fabric {
        address local
        port 3001
    }
}

namespace test {
    replication-factor 2
    memory-size 1G
    default-ttl 30d
    storage-engine device {
        file /opt/aerospike/data/test.dat
        filesize 4G
        data-in-memory false
        write-block-size 128K
    }
}

Aerospike Node-2 Configuration:

service {
}

logging {
    console {
        context any info
    }
}

network {
    service {
        address any
        port 3000
    }

    heartbeat {
        mode mesh
        address any
        port 3002
        mesh-seed-address-port aerospike-node1 3002
        mesh-seed-address-port aerospike-node2 3002
        interval 150
        timeout 10
    }

    fabric {
        address local
        port 3001
    }
}

namespace test {
    replication-factor 2
    memory-size 1G
    default-ttl 30d
    storage-engine device {
        file /opt/aerospike/data/test.dat
        filesize 4G
        data-in-memory false
        write-block-size 128K
    }
}

After modifying the configurations, please restart both Aerospike containers.

Here is the log output from aerospike-node1:

May 24 2023 18:30:23 GMT: INFO (as): (as.c:382) initializing services...
May 24 2023 18:30:23 GMT: INFO (service): (service.c:167) starting 10 service threads
May 24 2023 18:30:23 GMT: INFO (hb): (hb.c:6793) added new mesh seed aerospike-node1:3002
May 24 2023 18:30:23 GMT: INFO (hb): (hb.c:6793) added new mesh seed aerospike-node2:3002
May 24 2023 18:30:23 GMT: INFO (fabric): (fabric.c:791) updated fabric published address list to {127.0.0.1:3001}
May 24 2023 18:30:23 GMT: INFO (partition): (partition_balance.c:203) {test} 4096 partitions: found 0 absent, 4096 stored
May 24 2023 18:30:23 GMT: INFO (hb): (hb.c:5523) updated heartbeat published address list to {10.0.4.101:3002}
May 24 2023 18:30:23 GMT: INFO (smd): (smd.c:2342) no file '/opt/aerospike/smd/UDF.smd' - starting empty
May 24 2023 18:30:23 GMT: INFO (batch): (batch.c:814) starting 2 batch-index-threads
May 24 2023 18:30:23 GMT: INFO (health): (health.c:318) starting health monitor thread
May 24 2023 18:30:23 GMT: INFO (fabric): (fabric.c:416) starting 8 fabric send threads
May 24 2023 18:30:23 GMT: INFO (fabric): (fabric.c:430) starting 16 fabric rw channel recv threads
May 24 2023 18:30:23 GMT: INFO (fabric): (fabric.c:430) starting 4 fabric ctrl channel recv threads
May 24 2023 18:30:23 GMT: INFO (fabric): (fabric.c:430) starting 4 fabric bulk channel recv threads
May 24 2023 18:30:23 GMT: INFO (fabric): (fabric.c:430) starting 4 fabric meta channel recv threads
May 24 2023 18:30:23 GMT: INFO (fabric): (fabric.c:442) starting fabric accept thread
May 24 2023 18:30:23 GMT: INFO (hb): (hb.c:6978) initializing mesh heartbeat socket: 10.0.4.101:3002
May 24 2023 18:30:23 GMT: INFO (fabric): (socket.c:818) Started fabric endpoint 127.0.0.1:3001
May 24 2023 18:30:23 GMT: INFO (hb): (hb.c:7008) mtu of the network is 1450
May 24 2023 18:30:23 GMT: INFO (hb): (socket.c:818) Started mesh heartbeat endpoint 10.0.4.101:3002
May 24 2023 18:30:23 GMT: INFO (nsup): (nsup.c:197) starting namespace supervisor threads
May 24 2023 18:30:23 GMT: INFO (service): (service.c:942) starting reaper thread
May 24 2023 18:30:23 GMT: INFO (service): (socket.c:818) Started client endpoint 0.0.0.0:3000
May 24 2023 18:30:23 GMT: INFO (service): (service.c:199) starting accept thread
May 24 2023 18:30:23 GMT: INFO (as): (as.c:421) service ready: soon there will be cake!
May 24 2023 18:30:24 GMT: INFO (hb): (hb.c:6344) removing self seed entry host:aerospike-node1 port:3002
May 24 2023 18:30:24 GMT: INFO (hb): (hb.c:6832) removed mesh seed host:aerospike-node2 port 3002
May 24 2023 18:30:24 GMT: INFO (hb): (hb.c:4376) found redundant connections to same node (bb96504000a4202) - choosing at random
May 24 2023 18:30:24 GMT: INFO (hb): (hb.c:8581) node arrived bb96404000a4202
May 24 2023 18:30:24 GMT: INFO (fabric): (fabric.c:2580) fabric: node bb96404000a4202 arrived
May 24 2023 18:30:25 GMT: INFO (clustering): (clustering.c:6313) neighboring orphans for cluster formation: bb96404000a4202
May 24 2023 18:30:25 GMT: INFO (clustering): (clustering.c:6339) skipping forming cluster - cannot form new cluster from pending join requests (empty)
May 24 2023 18:30:27 GMT: INFO (clustering): (clustering.c:6313) neighboring orphans for cluster formation: bb96404000a4202
May 24 2023 18:30:27 GMT: INFO (clustering): (clustering.c:6339) skipping forming cluster - cannot form new cluster from pending join requests (empty)
May 24 2023 18:30:29 GMT: INFO (clustering): (clustering.c:6313) neighboring orphans for cluster formation: bb96404000a4202
May 24 2023 18:30:29 GMT: INFO (clustering): (clustering.c:6339) skipping forming cluster - cannot form new cluster from pending join requests (empty)
May 24 2023 18:30:31 GMT: INFO (clustering): (clustering.c:6313) neighboring orphans for cluster formation: bb96404000a4202
May 24 2023 18:30:31 GMT: INFO (clustering): (clustering.c:6339) skipping forming cluster - cannot form new cluster from pending join requests (empty)
May 24 2023 18:30:32 GMT: INFO (clustering): (clustering.c:6313) neighboring orphans for cluster formation: bb96404000a4202
May 24 2023 18:30:32 GMT: INFO (clustering): (clustering.c:6339) skipping forming cluster - cannot form new cluster from pending join requests (empty)
May 24 2023 18:30:33 GMT: INFO (info): (ticker.c:168) NODE-ID bb96504000a4202 CLUSTER-SIZE 0 CLUSTER-NAME null
May 24 2023 18:30:33 GMT: INFO (info): (ticker.c:242)    cluster-clock: skew-ms 0
May 24 2023 18:30:33 GMT: INFO (info): (ticker.c:263)    system: total-cpu-pct 5 user-cpu-pct 3 kernel-cpu-pct 2 free-mem-kbytes 3448904 free-mem-pct 85 thp-mem-kbytes 8192
May 24 2023 18:30:33 GMT: INFO (info): (ticker.c:285)    process: cpu-pct 2 threads (9,60,29,29) heap-kbytes (1141700,1142268,1182208) heap-efficiency-pct 100.0
May 24 2023 18:30:33 GMT: INFO (info): (ticker.c:295)    in-progress: info-q 0 rw-hash 0 proxy-hash 0 tree-gc-q 0 long-queries 0
May 24 2023 18:30:33 GMT: INFO (info): (ticker.c:319)    fds: proto (0,0,0) heartbeat (1,3,2) fabric (24,24,0)
May 24 2023 18:30:33 GMT: INFO (info): (ticker.c:328)    heartbeat-received: self 2 foreign 65
May 24 2023 18:30:33 GMT: INFO (info): (ticker.c:354)    fabric-bytes-per-second: bulk (4,4) ctrl (2,2) meta (2,2) rw (19,19)
May 24 2023 18:30:34 GMT: INFO (info): (ticker.c:413) {test} objects: all 0 master 0 prole 0 non-replica 0
May 24 2023 18:30:34 GMT: INFO (info): (ticker.c:477) {test} migrations: complete
May 24 2023 18:30:34 GMT: INFO (info): (ticker.c:504) {test} memory-usage: total-bytes 0 index-bytes 0 set-index-bytes 0 sindex-bytes 0 used-pct 0.00
May 24 2023 18:30:34 GMT: INFO (info): (ticker.c:586) {test} device-usage: used-bytes 0 avail-pct 99 cache-read-pct 0.00
May 24 2023 18:30:34 GMT: INFO (hb): (hb.c:4376) (repeated:1) found redundant connections to same node (bb96504000a4202) - choosing at random
May 24 2023 18:30:34 GMT: INFO (clustering): (clustering.c:6313) neighboring orphans for cluster formation: bb96404000a4202
May 24 2023 18:30:34 GMT: INFO (clustering): (clustering.c:6339) skipping forming cluster - cannot form new cluster from pending join requests (empty)
May 24 2023 18:30:36 GMT: INFO (clustering): (clustering.c:6313) neighboring orphans for cluster formation: bb96404000a4202
May 24 2023 18:30:36 GMT: INFO (clustering): (clustering.c:6339) skipping forming cluster - cannot form new cluster from pending join requests (empty)
May 24 2023 18:30:38 GMT: INFO (clustering): (clustering.c:6313) neighboring orphans for cluster formation: bb96404000a4202
May 24 2023 18:30:38 GMT: INFO (clustering): (clustering.c:6339) skipping forming cluster - cannot form new cluster from pending join requests (empty)
May 24 2023 18:30:40 GMT: INFO (clustering): (clustering.c:6313) neighboring orphans for cluster formation: bb96404000a4202
May 24 2023 18:30:40 GMT: INFO (clustering): (clustering.c:6339) skipping forming cluster - cannot form new cluster from pending join requests (empty)
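
When reading a log like the one above, it can help to pull out only the lines that say which address each listener was started on. The following is a minimal sketch, assuming the "Started ... endpoint <ip>:<port>" line format shown in this log; the function name `list_endpoints` is hypothetical and not part of any Aerospike tool.

```shell
# list_endpoints: read Aerospike log lines on stdin and print every
# "Started ... endpoint" entry, marking addresses in 127.0.0.0/8 because
# other containers cannot connect to a loopback listener.
list_endpoints() {
    awk '/Started .* endpoint / {
        addr = $NF                                   # last field is <ip>:<port>
        note = (addr ~ /^127\./) ? "  <- loopback only" : ""
        sub(/^.*Started /, "")                       # drop timestamp and context
        print $0 note
    }'
}
```

A possible use is `docker logs aerospike-node1 2>&1 | list_endpoints`, which reduces the startup log to one line per client, fabric, and heartbeat endpoint.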

I also checked the netstat output inside the container and did not see anything obviously wrong:

Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.11:42165        0.0.0.0:*               LISTEN      -
tcp        0      0 0.0.0.0:3000            0.0.0.0:*               LISTEN      7/asd
tcp        0      0 127.0.0.1:3001          0.0.0.0:*               LISTEN      7/asd
tcp        0      0 10.0.4.101:3002         0.0.0.0:*               LISTEN      7/asd

Telnet results for ports 3000, 3001 and 3002 on aerospike-node2:

root@294ae2e0b620:/# telnet aerospike-node2 3000
Trying 10.0.4.100...
Connected to aerospike-node2.
Escape character is '^]'.
^C^CConnection closed by foreign host.

root@294ae2e0b620:/# telnet aerospike-node2 3001
Trying 10.0.4.100...
telnet: Unable to connect to remote host: Connection refused

root@294ae2e0b620:/# telnet aerospike-node2 3002
Trying 10.0.4.100...
Connected to aerospike-node2.
Escape character is '^]'.
�hd
   �d
   |p
     �OtConnection closed by foreign host.

Based on this information, could you please assist me in identifying the cause of the issue and guide me on how to successfully form the cluster? Any suggestions or recommendations would be highly appreciated.


Solution

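  • Did you follow instructions that told you to set the fabric address to local? That setting is the likely cause. The log shows the fabric endpoint being published and started on 127.0.0.1:3001, netstat shows port 3001 listening only on 127.0.0.1, and the telnet to aerospike-node2 on port 3001 is refused. The nodes see each other's heartbeats on port 3002 but cannot reach each other over fabric, so the cluster never forms. Whatever address or network interface you specify for fabric has to be reachable from the other node on the configured port (3001 in this case).

    You may also want to check https://medium.com/aerospike-developer-blog/how-do-i-get-a-2-node-aerospike-cluster-running-quickly-in-docker-without-editing-a-single-file-1c2a94564a99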
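
One way to apply this suggestion is to change only the fabric stanza so that it no longer binds to the local address. The sketch below assumes that `address any` is an acceptable replacement on this Docker network (it is the value the question already uses for service and heartbeat); the function name `fix_fabric_address` and the file names in the usage note are hypothetical.

```shell
# fix_fabric_address: print the config file given as $1 with "address local"
# rewritten to "address any" inside the fabric { ... } stanza only.
# The service and heartbeat stanzas are passed through unchanged.
# Assumption: "any" is a suitable fabric address for this setup.
fix_fabric_address() {
    awk '
        /^[[:space:]]*fabric[[:space:]]*[{]/ { in_fabric = 1 }
        in_fabric && /^[[:space:]]*address[[:space:]]+local[[:space:]]*$/ {
            sub(/local/, "any")
        }
        in_fabric && /[}]/ { in_fabric = 0 }   # closing brace ends the stanza
        { print }
    ' "$1"
}
```

For example, `fix_fabric_address aerospike.conf > aerospike.fixed.conf` writes a corrected copy that can then be applied to both nodes. After restarting the containers, the ticker line that reported CLUSTER-SIZE 0 in the question should report CLUSTER-SIZE 2 if the nodes can now reach each other on port 3001.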