I've been trying to set up a cluster of MirrorMaker nodes in Kubernetes, using Kafka version 3.5.1.
The setup I used is based on what is described in this KIP, and relates to the use of dedicated.mode.enable.internal.rest, which was delivered as part of JIRA #10586 - meaning it should be available since Kafka 3.5.0.
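The issues I get are not related to topic/offset replication, which has been working fine. Rather, the REST API does not respond as expected and one pod ends up doing all of the replication work, as detailed below.
I can leave it running for hours and the result remains the same.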
The following is the MirrorMaker configuration I have tried:
clusters = source, destination
source.bootstrap.servers = <SOURCE_BOOTSTRAP_SERVERS>
destination.bootstrap.servers = <TARGET_BOOTSTRAP_SERVERS>
source->destination.enabled = true
source->destination.topics = <TARGET_TOPIC_WHITELIST>
groups=.*
topics.blacklist = .*[\-\.]internal
emit.heartbeats.enabled = true
source->destination.sync.group.offsets.enabled = true
# Auth config
security.protocol=SASL_SSL
source.security.protocol=SASL_SSL
destination.security.protocol=SASL_SSL
source.sasl.mechanism=SCRAM-SHA-512
destination.sasl.mechanism=SCRAM-SHA-512
source.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="<SOURCE_USERNAME>" password="<SOURCE_PASSWORD>";
destination.sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="<TARGET_USERNAME>" password="<TARGET_PASSWORD>";
replication.policy.class=org.apache.kafka.connect.mirror.IdentityReplicationPolicy
# rest API - new since 3.5.0
dedicated.mode.enable.internal.rest=true
This includes the flag that enables the REST API.
As the nodes are launched (via connect-mirror-maker.sh), I can see the following in the logs of one of our replicas:
[2023-07-27 12:36:10,825] INFO Advertised URI: http://<advertised_IP>:8083/ (org.apache.kafka.connect.runtime.rest.RestServer:394)
[2023-07-27 12:36:10,826] INFO REST server listening at http://<advertised_IP>:8083/, advertising URL http://<advertised_IP>:8083/ (org.apache.kafka.connect.runtime.rest.RestServer:202)
followed by the group join indication (the lines below can be verified in all replicas):
[2023-07-27 12:36:34,719] INFO [Worker clientId=source->destination, groupId=source-mm2] Joined group at generation 44 with protocol version 2 and got assignment: Assignment{error=0, leader='source->destination-224a6033-361f-426b-9840-fc078eb87333', leaderUrl='http://<advertised_IP>:8083/', offset=705, connectorIds=[MirrorHeartbeatConnector], taskIds=[MirrorHeartbeatConnector-0], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0 (org.apache.kafka.connect.runtime.distributed.DistributedHerder:2394)
If I run curl against the leaderUrl above (at the root or at /connectors) from any node, using the advertised IP (or localhost on the leader), I end up with {"error_code":404,"message":"HTTP 404 Not Found"}.
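For reference, the curl check described above can be scripted roughly as follows. This is only a sketch: join_url and probe are helper names made up for illustration, and LEADER_URL is a placeholder that has to be set to the leaderUrl reported in the log. It prints the HTTP status code returned for each path.

```shell
#!/usr/bin/env bash

# Hypothetical helper: join a base URL and a path without doubling the slash.
join_url() {
  local base="${1%/}"   # drop one trailing slash, if present
  local path="${2#/}"   # drop one leading slash, if present
  printf '%s/%s\n' "$base" "$path"
}

# Hypothetical helper: print "<status code> <path>" for each path given.
probe() {
  local base="$1"
  shift
  local path code
  for path in "$@"; do
    # -s: silent, -o /dev/null: discard the body, -w: print only the status code
    code="$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' \
      "$(join_url "$base" "$path")" || true)"
    printf '%s %s\n' "$code" "$path"
  done
}

# Only runs when LEADER_URL is set, e.g. LEADER_URL=http://<advertised_IP>:8083/
if [ -n "${LEADER_URL:-}" ]; then
  probe "$LEADER_URL" / /connectors
fi
```

Running it from each pod in turn makes it easy to compare what the leader and the followers return for the same paths.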
I can see the busy pod's CPU consumption going high as it replicates all messages, while the other pods stay idle.
Raising the maximum number of tasks had the desired effect: we started testing with tasks.max=10 and it already made a huge difference (we are now tuning it based on our needs). Some related context can be found in this answer.