I have a docker compose stack that includes ZooKeeper. It has worked beautifully for years.
zoo:
  container_name: zoo
  image: public.ecr.aws/docker/library/zookeeper:3.9.3
  restart: unless-stopped
  stdin_open: true
  tty: true
My Java and Ruby clients connect to ZooKeeper using zoo:2181 as the connection string.
I am now running this same container in AWS ECS. I am using a ServiceConnectConfiguration to make the container discoverable with the name "zoo".
My Ruby client seems to have no issue connecting to ZooKeeper.
My Java client is very unreliable if I use the ServiceConnect name.
jshell> try (ZooKeeper zk = new ZooKeeper(String.format("%s:%d", "zoo", 2181), 5000, null)){
...> System.out.println(zk.getChildren("/",false));
...> }
[batches, zookeeper, batch-uuids, locks, jobs]
jshell> try (ZooKeeper zk = new ZooKeeper(String.format("%s:%d", "zoo", 2181), 5000, null)){
...> System.out.println(zk.getChildren("/",false));
...> }
[batches, zookeeper, batch-uuids, locks, jobs]
jshell> try (ZooKeeper zk = new ZooKeeper(String.format("%s:%d", "zoo", 2181), 5000, null)){
...> System.out.println(zk.getChildren("/",false));
...> }
| Exception org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /
| at KeeperException.create (KeeperException.java:101)
| at KeeperException.create (KeeperException.java:53)
| at ZooKeeper.getChildren (ZooKeeper.java:2366)
| at ZooKeeper.getChildren (ZooKeeper.java:2393)
| at (#9:2)
If I use the IP address listed for "zoo" in /etc/hosts instead of the name, I have no trouble connecting.
jshell> try (ZooKeeper zk = new ZooKeeper(String.format("%s:%d", "127.255.0.10", 2181), 5000, null)){
...> System.out.println(zk.getChildren("/",false));
...> }
[batches, zookeeper, batch-uuids, locks, jobs]
jshell> try (ZooKeeper zk = new ZooKeeper(String.format("%s:%d", "127.255.0.10", 2181), 5000, null)){
...> System.out.println(zk.getChildren("/",false));
...> }
[batches, zookeeper, batch-uuids, locks, jobs]
jshell> try (ZooKeeper zk = new ZooKeeper(String.format("%s:%d", "127.255.0.10", 2181), 5000, null)){
...> System.out.println(zk.getChildren("/",false));
...> }
[batches, zookeeper, batch-uuids, locks, jobs]
jshell> try (ZooKeeper zk = new ZooKeeper(String.format("%s:%d", "127.255.0.10", 2181), 5000, null)){
...> System.out.println(zk.getChildren("/",false));
...> }
[batches, zookeeper, batch-uuids, locks, jobs]
I am unsure whether I need to fix this in the ServiceConnect configuration or whether I can configure my ZooKeeper service to function more reliably in this environment.
Looking at the services in ECS, there seems to be sufficient memory and CPU for the running tasks.
Forcing traffic to IPv4 seems to have resolved this issue.
bash-4.2# jshell -R -Djava.net.preferIPv4Stack=true
| Welcome to JShell -- Version 21.0.6
| For an introduction type: /help intro
jshell> import org.apache.zookeeper.ZooKeeper;
jshell> try (ZooKeeper zk = new ZooKeeper(String.format("%s:%d", "zoo", 2181), 5000, null)){
...> System.out.println(zk.getChildren("/",false));
...> }
[batches, zookeeper, batch-uuids, jobs]
jshell> try (ZooKeeper zk = new ZooKeeper(String.format("%s:%d", "zoo", 2181), 5000, null)){
...> System.out.println(zk.getChildren("/",false));
...> }
[batches, zookeeper, batch-uuids, jobs]
jshell> try (ZooKeeper zk = new ZooKeeper(String.format("%s:%d", "zoo", 2181), 5000, null)){
...> System.out.println(zk.getChildren("/",false));
...> }
[batches, zookeeper, batch-uuids, jobs]
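Since forcing IPv4 helped, it may be worth checking what the JVM actually gets back when it resolves "zoo". The following is a minimal diagnostic sketch, not part of my application: the class name ResolveCheck and its helper methods are made up for illustration, and it only uses java.net.InetAddress from the standard library. My assumption (not confirmed above, where I only show the IPv4 entry) is that the name resolves to both an IPv4 and an IPv6 address, and that the Java client sometimes picks the IPv6 one.

```java
import java.net.Inet4Address;
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists every address a host name resolves to,
// labeled by address family, so mixed IPv4/IPv6 results are easy to spot.
class ResolveCheck {

    // Returns "IPv4" or "IPv6" depending on the concrete InetAddress type.
    static String family(InetAddress addr) {
        if (addr instanceof Inet4Address) return "IPv4";
        if (addr instanceof Inet6Address) return "IPv6";
        return "unknown";
    }

    // Formats each address as "<family> <literal address>".
    static List<String> describe(InetAddress[] addrs) {
        List<String> out = new ArrayList<>();
        for (InetAddress a : addrs) {
            out.add(family(a) + " " + a.getHostAddress());
        }
        return out;
    }

    public static void main(String[] args) throws UnknownHostException {
        // "zoo" is the ServiceConnect name from this post; pass another name to override.
        String host = args.length > 0 ? args[0] : "zoo";
        for (String line : describe(InetAddress.getAllByName(host))) {
            System.out.println(line);
        }
    }
}
```

Running this inside the client container once without and once with -Djava.net.preferIPv4Stack=true should show whether an IPv6 address disappears from the list when the flag is set.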