I am trying to run Kafka with docker-compose. I have this yml file:
version: '3'
services:
  zookeeper:
    image: ${REPOSITORY}/cp-zookeeper:${TAG}
    hostname: zookeeper
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    volumes:
      - ./zoo:/var/lib/zookeeper
  broker:
    image: ${REPOSITORY}/cp-kafka:${TAG}
    hostname: broker
    container_name: broker
    depends_on:
      - zookeeper
    ports:
      - "29092:29092"
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
    volumes:
      - ./broker:/var/lib/kafka
In the directory containing the docker-compose.yml file, I ran:
docker-compose up -d
After that, the folders ./broker and ./zoo appeared in my directory. Inside, they mirror the structure inside the containers (./zoo/data, ./broker/data), but the directories contain no files.
I tried
docker-compose exec broker ls /var/lib/kafka/data
and saw folders and files for the default topics.
This comes down to the interaction between the volumes declared in the image's Dockerfile and the directory that you are trying to bind-mount in your Docker Compose file.
If you look at each image's Dockerfile, you'll see that it declares volumes; they also show up when you inspect the running container. Here's what that looks like with your configuration:
➜ docker inspect zookeeper|jq '.[].Mounts[] | .Type ,.Destination'
"volume"
"/etc/zookeeper/secrets"
"bind"
"/var/lib/zookeeper"
"volume"
"/var/lib/zookeeper/log"
"volume"
"/var/lib/zookeeper/data"
You'll notice that there are two volumes (declared in the image itself, i.e. in the Dockerfile) on the specific data paths for ZooKeeper:
/var/lib/zookeeper/log
/var/lib/zookeeper/data
In addition, there is the bind mount from the Docker Compose:
/var/lib/zookeeper/
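These clash: the image's volumes are mounted on top of the data and log subdirectories, so whatever ZooKeeper writes there lands in those volumes rather than in your host directory. That explains the empty folders you're seeing.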
A similar pattern exists for the broker.
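To spot this kind of clash without reading the mount list by eye, here is a minimal sketch of a shell helper. The function name find_shadowing_volumes is made up for illustration, and it assumes its input is one "TYPE DESTINATION" pair per line, like the values printed by the jq command above but joined onto a single line.

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker or the Confluent images).
# Assumed input on stdin: one "TYPE DESTINATION" pair per line, e.g.
#   bind /var/lib/zookeeper
#   volume /var/lib/zookeeper/data
# Prints every volume whose destination lies underneath a bind mount.
find_shadowing_volumes() {
  awk '
    $1 == "bind"   { binds[++nb] = $2 }   # remember bind-mount destinations
    $1 == "volume" { vols[++nv] = $2 }    # remember volume destinations
    END {
      for (i = 1; i <= nv; i++)
        for (j = 1; j <= nb; j++) {
          prefix = binds[j]
          sub(/\/+$/, "", prefix)         # drop any trailing slash
          # A volume clashes when its path starts with "<bind path>/"
          if (index(vols[i], prefix "/") == 1)
            print vols[i] " is mounted on top of bind mount " binds[j]
        }
    }'
}

# Sample input copied from the docker inspect output shown above
printf '%s\n' \
  'volume /etc/zookeeper/secrets' \
  'bind /var/lib/zookeeper' \
  'volume /var/lib/zookeeper/log' \
  'volume /var/lib/zookeeper/data' | find_shadowing_volumes
```

For the sample input this reports /var/lib/zookeeper/log and /var/lib/zookeeper/data, and leaves /etc/zookeeper/secrets alone. Against a live container, something like docker inspect zookeeper --format '{{range .Mounts}}{{println .Type .Destination}}{{end}}' should produce input in the assumed shape, though I have only checked the helper against the sample above.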
So, in short, you need to mount a local host directory onto each specific volume path declared in the image:
---
version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:5.4.1
    hostname: zookeeper
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000
    volumes:
      - ./zoo/data:/var/lib/zookeeper/data
      - ./zoo/log:/var/lib/zookeeper/log
  broker:
    image: confluentinc/cp-kafka:5.4.1
    hostname: broker
    container_name: broker
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
    volumes:
      - ./broker/data:/var/lib/kafka/data
With this done, we can see there are no conflicts in the container paths:
➜ docker inspect zookeeper|jq '.[].Mounts '
[
  {
    "Type": "bind",
    "Source": "/private/tmp/zoo/log",
    "Destination": "/var/lib/zookeeper/log",
    "Mode": "rw",
    "RW": true,
    "Propagation": "rprivate"
  },
  {
    "Type": "bind",
    "Source": "/private/tmp/zoo/data",
    "Destination": "/var/lib/zookeeper/data",
    "Mode": "rw",
    "RW": true,
    "Propagation": "rprivate"
  },
  {
    "Type": "volume",
    "Name": "6cbb584e0d9aa2f119869b264544f587909d9f417fc553a7bb2954dd28ecb8ea",
    "Source": "/var/lib/docker/volumes/6cbb584e0d9aa2f119869b264544f587909d9f417fc553a7bb2954dd28ecb8ea/_data",
    "Destination": "/etc/zookeeper/secrets",
    "Driver": "local",
    "Mode": "",
    "RW": true,
    "Propagation": ""
  }
]
and data from the containers:
➜ docker exec zookeeper ls -l /var/lib/zookeeper/data /var/lib/zookeeper/log
/var/lib/zookeeper/data:
total 0
drwxr-xr-x 3 root root 96 Apr 3 08:59 version-2
/var/lib/zookeeper/log:
total 0
drwxr-xr-x 3 root root 96 Apr 3 08:59 version-2
➜ docker exec broker ls -l /var/lib/kafka/data
total 16
drwxr-xr-x 6 root root 192 Apr 3 08:59 __confluent.support.metrics-0
-rw-r--r-- 1 root root 0 Apr 3 08:59 cleaner-offset-checkpoint
-rw-r--r-- 1 root root 4 Apr 3 09:01 log-start-offset-checkpoint
-rw-r--r-- 1 root root 88 Apr 3 08:59 meta.properties
-rw-r--r-- 1 root root 36 Apr 3 09:01 recovery-point-offset-checkpoint
-rw-r--r-- 1 root root 36 Apr 3 09:02 replication-offset-checkpoint
-rw-r--r-- 1 root root 0 Apr 3 08:30 wibble
is stored on the local host:
➜ ls -l broker/data zoo/data zoo/log
broker/data:
total 32
drwxr-xr-x 6 rmoff wheel 192 3 Apr 09:59 __confluent.support.metrics-0
-rw-r--r-- 1 rmoff wheel 0 3 Apr 09:59 cleaner-offset-checkpoint
-rw-r--r-- 1 rmoff wheel 4 3 Apr 10:00 log-start-offset-checkpoint
-rw-r--r-- 1 rmoff wheel 88 3 Apr 09:59 meta.properties
-rw-r--r-- 1 rmoff wheel 36 3 Apr 10:00 recovery-point-offset-checkpoint
-rw-r--r-- 1 rmoff wheel 36 3 Apr 10:01 replication-offset-checkpoint
-rw-r--r-- 1 rmoff wheel 0 3 Apr 09:30 wibble
zoo/data:
total 0
drwxr-xr-x 3 rmoff wheel 96 3 Apr 09:59 version-2
zoo/log:
total 0
drwxr-xr-x 3 rmoff wheel 96 3 Apr 09:59 version-2
See also Data Volumes for Kafka and ZooKeeper