I have my project set up using Spring Boot and Spring Kafka, and there are three consumers. Checking the logs, I can see that from time to time the consumers get disconnected:
catalina.out:2019-04-27 02:19:57.962 INFO 18245 --- [ntainer#2-0-C-1] o.a.kafka.clients.FetchSessionHandler : [Consumer clientId=consumer-2, groupId=FalconDataRiver1] Error sending fetch request (sessionId=1338157432, epoch=205630) to node 101: org.apache.kafka.common.errors.DisconnectException.
catalina.out:2019-04-27 02:19:57.962 INFO 18245 --- [ntainer#4-0-C-1] o.a.kafka.clients.FetchSessionHandler : [Consumer clientId=consumer-6, groupId=FalconDataRiver1] Error sending fetch request (sessionId=727942178, epoch=234691) to node 101: org.apache.kafka.common.errors.DisconnectException.
catalina.out:2019-04-27 02:19:57.962 INFO 18245 --- [ntainer#0-0-C-1] o.a.kafka.clients.FetchSessionHandler : [Consumer clientId=consumer-10, groupId=FalconDataRiver1] Error sending fetch request (sessionId=836405004, epoch=234351) to node 101: org.apache.kafka.common.errors.DisconnectException.
catalina.out:2019-04-27 02:19:58.023 INFO 18245 --- [ntainer#1-0-C-1] o.a.kafka.clients.FetchSessionHandler : [Consumer clientId=consumer-12, groupId=FalconDataRiver1] Error sending fetch request (sessionId=1385585601, epoch=234394) to node 101: org.apache.kafka.common.errors.DisconnectException.
catalina.out:2019-04-27 02:19:58.023 INFO 18245 --- [ntainer#3-0-C-1] o.a.kafka.clients.FetchSessionHandler : [Consumer clientId=consumer-4, groupId=FalconDataRiver1] Error sending fetch request (sessionId=452630289, epoch=201944) to node 101: org.apache.kafka.common.errors.DisconnectException.
catalina.out:2019-04-27 02:19:58.023 INFO 18245 --- [ntainer#5-0-C-1] o.a.kafka.clients.FetchSessionHandler : [Consumer clientId=consumer-8, groupId=FalconDataRiver1] Error sending fetch request (sessionId=78802572, epoch=103) to node 101: org.apache.kafka.common.errors.DisconnectException.
catalina.out:2019-04-27 02:19:58.040 INFO 18245 --- [ntainer#2-0-C-1] o.a.kafka.clients.FetchSessionHandler : [Consumer clientId=consumer-2, groupId=FalconDataRiver1] Error sending fetch request (sessionId=1338157432, epoch=INITIAL) to node 101: org.apache.kafka.common.errors.DisconnectException.
I haven't configured anything on the consumers with respect to reconnection. I know that there are two relevant properties in the Kafka documentation:
reconnect.backoff.max.ms
-- The maximum amount of time in milliseconds to wait when reconnecting to a broker that has repeatedly failed to connect. If provided, the backoff per host will increase exponentially for each consecutive connection failure, up to this maximum. After calculating the backoff increase, 20% random jitter is added to avoid connection storms. (Default: 1000 milliseconds)
reconnect.backoff.ms
-- The base amount of time to wait before attempting to reconnect to a given host. This avoids repeatedly connecting to a host in a tight loop. This backoff applies to all connection attempts by the client to a broker. (Default: 50 milliseconds)
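For reference, if I were to set these explicitly, I assume it would look something like the following in a Spring Kafka consumer configuration. The bootstrap server address, deserializers, and backoff values here are placeholders for illustration, not the settings I am actually running with:

import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;

@Configuration
public class KafkaConsumerConfig {

    @Bean
    public ConsumerFactory<String, String> consumerFactory() {
        Map<String, Object> props = new HashMap<>();
        // Placeholder connection settings -- replace with the real broker list and group id.
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "FalconDataRiver1");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);

        // The two reconnect backoff settings described above; the values are illustrative only.
        props.put(ConsumerConfig.RECONNECT_BACKOFF_MS_CONFIG, 50);          // reconnect.backoff.ms
        props.put(ConsumerConfig.RECONNECT_BACKOFF_MAX_MS_CONFIG, 10_000);  // reconnect.backoff.max.ms

        return new DefaultKafkaConsumerFactory<>(props);
    }
}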
I can see that the three consumers are still consuming after the log messages above, so they have evidently recovered from these disconnect exceptions. What bothers me is that there is nothing in the logs that records the reconnection and recovery process.
Am I missing something here? Thanks!
Kafka recovers from this internal error automatically, which is why it is logged at INFO level; evidently, your consumers are still able to consume messages. Switch the log level to DEBUG if you want more information about what is causing the disconnects and how the client reconnects.
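For example, in a Spring Boot application you can typically do this by adding logging.level.org.apache.kafka=DEBUG (or a narrower package such as org.apache.kafka.clients) to application.properties, which should make the client log the details of the disconnect and the subsequent reconnection attempts.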