springasynchronousapache-kafkasendproducer

Spring Kafka asynchronous send calls block


I'm using Spring-Kafka version 1.2.1 and, when the Kafka server is down/unreachable, the asynchronous send calls block for a time. It seems to be the TCP timeout. The code is something like this:

ListenableFuture<SendResult<K, V>> future = kafkaTemplate.send(topic, key, message);
future.addCallback(new ListenableFutureCallback<SendResult<K, V>>() {
    @Override
    public void onSuccess(SendResult<K, V> result) {
        ...
    }

    @Override
    public void onFailure(Throwable ex) {
        ...
    }
});

I've taken a really quick look at the Spring-Kafka code and it seems to just pass the task along to the kafka client library, translating a callback interaction to a future object interaction. Looking at the kafka client library, the code gets more complex and I didn't take the time to understand it all, but I guess it may be making remote calls (metadata, at least?) in the same thread.

As a user, I expected the Spring-Kafka methods that return a future to return immediately, even if the remote kafka server is unreachable.

Any confirmation if my understanding is wrong or if this is a bug would be welcome. I ended up making it asynchronous on my end for now.

Another problem is that Spring-Kafka documentation says, at the beginning, that it provides synchronous and asynchronous send methods. I couldn't find any methods that do not return futures, maybe the documentation needs updating.

I'm happy to provide any further details if needed. Thanks.


Solution

  • If I look at the KafkaProducer itself, there are two parts of sending a message:

    1. Storing the message into the internal buffer.
    2. Uploading the message from the buffer into Kafka.

    KafkaProducer is asynchronous for the second part, not the first part.

    The send() method can still be blocked on the first part and eventually throw TimeoutExceptions, e.g:

    If the server is completely unresponsive, you will probably encounter both issues.

    Update:

    I tested and confirmed this in Kafka 2.2.1. It looks like this behaviour might be different in 2.4 and/or 2.6: KAFKA-3720