apache-kafka spring-cloud-stream exactly-once

Spring Cloud Stream project fails with "Failed to obtain partition information" error


When I use this configuration:

spring:
  cloud:
    stream:
      kafka:
        binder:
          min-partition-count: 1
          replication-factor: 1
  kafka:
    producer:
      transaction-id-prefix: tx-
      retries: 1
      acks: all

My application starts correctly, but the transactional.id shown in the console output is null. To get a proper transactional.id, I added this extra transaction configuration to spring-cloud-stream:

spring:
  cloud:
    stream:
      kafka:
        binder:
          min-partition-count: 1
          replication-factor: 1
          transaction:
            transaction-id-prefix: txl-
  kafka:
    producer:
      transaction-id-prefix: tx-
      retries: 1
      acks: all

But now the service does not start successfully, and the console output shows this:

app_poc.1.nqc57nvh0qhr@ms-poc-02    | 2020-11-25 20:27:49.435  INFO [poc,,,] 1 --- [           main] o.a.kafka.common.utils.AppInfoParser     : Kafka version: 2.5.1
app_poc.1.nqc57nvh0qhr@ms-poc-02    | 2020-11-25 20:27:49.437  INFO [poc,,,] 1 --- [           main] o.a.kafka.common.utils.AppInfoParser     : Kafka commitId: 0efa8fb0f4c73d92
app_poc.1.nqc57nvh0qhr@ms-poc-02    | 2020-11-25 20:27:49.437  INFO [poc,,,] 1 --- [           main] o.a.kafka.common.utils.AppInfoParser     : Kafka startTimeMs: 1606336069435
app_poc.1.nqc57nvh0qhr@ms-poc-02    | 2020-11-25 20:27:49.597  INFO [poc,,,] 1 --- [           main] o.a.k.clients.producer.ProducerConfig    : ProducerConfig values: 
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   acks = -1
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   batch.size = 16384
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   bootstrap.servers = [kafka:29092]
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   buffer.memory = 33554432
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   client.dns.lookup = default
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   client.id = producer-txl-1
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   compression.type = none
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   connections.max.idle.ms = 540000
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   delivery.timeout.ms = 120000
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   enable.idempotence = true
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   interceptor.classes = []
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   key.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   linger.ms = 0
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   max.block.ms = 60000
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   max.in.flight.requests.per.connection = 5
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   max.request.size = 1048576
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   metadata.max.age.ms = 300000
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   metadata.max.idle.ms = 300000
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   metric.reporters = []
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   metrics.num.samples = 2
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   metrics.recording.level = INFO
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   metrics.sample.window.ms = 30000
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   receive.buffer.bytes = 32768
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   reconnect.backoff.max.ms = 1000
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   reconnect.backoff.ms = 50
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   request.timeout.ms = 30000
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   retries = 1
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   retry.backoff.ms = 100
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   sasl.client.callback.handler.class = null
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   sasl.jaas.config = null
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   sasl.kerberos.kinit.cmd = /usr/bin/kinit
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   sasl.kerberos.min.time.before.relogin = 60000
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   sasl.kerberos.service.name = null
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   sasl.kerberos.ticket.renew.jitter = 0.05
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   sasl.kerberos.ticket.renew.window.factor = 0.8
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   sasl.login.callback.handler.class = null
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   sasl.login.class = null
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   sasl.login.refresh.buffer.seconds = 300
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   sasl.login.refresh.min.period.seconds = 60
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   sasl.login.refresh.window.factor = 0.8
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   sasl.login.refresh.window.jitter = 0.05
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   sasl.mechanism = GSSAPI
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   security.protocol = PLAINTEXT
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   security.providers = null
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   send.buffer.bytes = 131072
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   ssl.cipher.suites = null
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   ssl.enabled.protocols = [TLSv1.2]
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   ssl.endpoint.identification.algorithm = https
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   ssl.key.password = null
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   ssl.keymanager.algorithm = SunX509
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   ssl.keystore.location = null
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   ssl.keystore.password = null
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   ssl.keystore.type = JKS
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   ssl.protocol = TLSv1.2
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   ssl.provider = null
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   ssl.secure.random.implementation = null
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   ssl.trustmanager.algorithm = PKIX
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   ssl.truststore.location = null
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   ssl.truststore.password = null
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   ssl.truststore.type = JKS
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   transaction.timeout.ms = 60000
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   transactional.id = txl-1
app_poc.1.nqc57nvh0qhr@ms-poc-02    |   value.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
app_poc.1.nqc57nvh0qhr@ms-poc-02    | 
app_poc.1.nqc57nvh0qhr@ms-poc-02    | 2020-11-25 20:27:49.599  INFO [poc,,,] 1 --- [           main] o.a.k.clients.producer.KafkaProducer     : [Producer clientId=producer-txl-1, transactionalId=txl-1] Instantiated a transactional producer.
app_poc.1.nqc57nvh0qhr@ms-poc-02    | 2020-11-25 20:27:49.623  INFO [poc,,,] 1 --- [           main] o.a.kafka.common.utils.AppInfoParser     : Kafka version: 2.5.1
app_poc.1.nqc57nvh0qhr@ms-poc-02    | 2020-11-25 20:27:49.624  INFO [poc,,,] 1 --- [           main] o.a.kafka.common.utils.AppInfoParser     : Kafka commitId: 0efa8fb0f4c73d92
app_poc.1.nqc57nvh0qhr@ms-poc-02    | 2020-11-25 20:27:49.624  INFO [poc,,,] 1 --- [           main] o.a.kafka.common.utils.AppInfoParser     : Kafka startTimeMs: 1606336069623
app_poc.1.nqc57nvh0qhr@ms-poc-02    | 2020-11-25 20:27:49.626  INFO [poc,,,] 1 --- [           main] o.a.k.c.p.internals.TransactionManager   : [Producer clientId=producer-txl-1, transactionalId=txl-1] Invoking InitProducerId for the first time in order to acquire a producer ID
app_poc.1.nqc57nvh0qhr@ms-poc-02    | 2020-11-25 20:27:49.637  INFO [poc,,,] 1 --- [ producer-txl-1] org.apache.kafka.clients.Metadata        : [Producer clientId=producer-txl-1, transactionalId=txl-1] Cluster ID: 3wV8FW9yTfKSVhNwNMoC2Q
app_poc.1.nqc57nvh0qhr@ms-poc-02    | 2020-11-25 20:28:49.630 ERROR [poc,,,] 1 --- [           main] o.s.c.s.b.k.p.KafkaTopicProvisioner      : Failed to obtain partition information
app_poc.1.nqc57nvh0qhr@ms-poc-02    | 
app_poc.1.nqc57nvh0qhr@ms-poc-02    | org.apache.kafka.common.errors.TimeoutException: Timeout expired after 60000milliseconds while awaiting InitProducerId

The error is "Failed to obtain partition information". I am definitely doing something wrong with my configuration.

My goal is exactly-once semantics, to avoid duplicates. That is why I want to see that transactional.id.

Extra info: my consumer is transactional, using JPA and Kafka transactions together (transaction synchronization using ChainedKafkaTransactionManager).

EDITED: In a @Configuration class I have these beans

    @Bean
    @Primary
    fun transactionManager(em: EntityManagerFactory): JpaTransactionManager {
        return JpaTransactionManager(em)
    }

    @Bean
    fun kafkaTransactionManager(producerFactory: ProducerFactory<Any, Any>): KafkaTransactionManager<*, *> {
        return KafkaTransactionManager(producerFactory)
    }

    @Bean
    fun chainedTransactionManager(
        kafkaTransactionManager: KafkaTransactionManager<String, String>,
        transactionManager: JpaTransactionManager,
    ): ChainedKafkaTransactionManager<Any, Any> {
        return ChainedKafkaTransactionManager(kafkaTransactionManager, transactionManager)
    }

    @Bean
    fun kafkaListenerContainerFactory(
        configurer: ConcurrentKafkaListenerContainerFactoryConfigurer,
        kafkaConsumerFactory: ConsumerFactory<Any, Any>,
        chainedKafkaTransactionManager: ChainedKafkaTransactionManager<Any, Any>,
    ): ConcurrentKafkaListenerContainerFactory<*, *> {
        val factory = ConcurrentKafkaListenerContainerFactory<Any, Any>()
        configurer.configure(factory, kafkaConsumerFactory)
        factory.containerProperties.transactionManager = chainedKafkaTransactionManager
        return factory
    }

And my processor class, with the corresponding @Transactional:

@EnableKafka
@EnableBinding(Channels::class)
@Service
@Transactional
class EventProcessor()
...

As far as I can tell, transaction synchronization works with the first configuration shown above.

I used this logging configuration to confirm the "Initializing transaction synchronization" and "Clearing transaction synchronization" messages from TransactionSynchronizationManager.

logging:
  level:
    org.springframework.kafka: trace
    org.springframework.transaction: trace

Solution

  • See this answer.

    You most likely don't have enough replicas or in-sync replicas for the transaction log topics.
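
    For context: the broker settings transaction.state.log.replication.factor and transaction.state.log.min.isr default to 3 and 2, which a single-broker cluster cannot satisfy, so InitProducerId can time out as in the log above. A minimal sketch for a single-broker development setup follows. It assumes the broker runs from the Confluent Docker image, which maps KAFKA_* environment variables to broker settings; the service name and image are assumptions, not taken from the question.

    ```yaml
    # Hypothetical docker-compose fragment; service name and image are assumptions.
    services:
      kafka:
        image: confluentinc/cp-kafka
        environment:
          # Maps to transaction.state.log.replication.factor (broker default: 3)
          KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
          # Maps to transaction.state.log.min.isr (broker default: 2)
          KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
    ```

    With a plain server.properties file, the same two settings would be set directly under their dotted names. Values of 1 are only reasonable for development; a production cluster would normally keep the defaults and run enough brokers.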

    Regarding "using ChainedKafkaTransactionManager":

    That is only supported in spring-cloud-stream (out of the box) for producer-only transactions. For consume->produce->publishToKafka operations, you must use @Transactional on the listener, with just the JPA transaction manager; the result is similar to transaction synchronization.

    Or, you must inject a properly configured CKTM into the binding's listener container.
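
    One way to do that injection might be a ListenerContainerCustomizer bean, which the binder applies to each binding's listener container. The sketch below is an assumption-laden illustration, not a verified fix: the class name BindingTransactionConfig is hypothetical, it assumes a ChainedKafkaTransactionManager bean like the one in the question exists, and it does not address what "properly configured" requires (for example, which producer factory the KafkaTransactionManager wraps).

    ```kotlin
    import org.springframework.cloud.stream.config.ListenerContainerCustomizer
    import org.springframework.context.annotation.Bean
    import org.springframework.context.annotation.Configuration
    import org.springframework.kafka.listener.AbstractMessageListenerContainer
    import org.springframework.kafka.transaction.ChainedKafkaTransactionManager

    // Hypothetical configuration class; the name is illustrative only.
    @Configuration
    class BindingTransactionConfig {

        // Assumes a ChainedKafkaTransactionManager bean is already defined elsewhere.
        @Bean
        fun containerCustomizer(
            chainedTransactionManager: ChainedKafkaTransactionManager<Any, Any>
        ): ListenerContainerCustomizer<AbstractMessageListenerContainer<*, *>> =
            ListenerContainerCustomizer<AbstractMessageListenerContainer<*, *>> { container, _, _ ->
                // Called once per consumer binding; the destination and group arguments are ignored here.
                container.containerProperties.transactionManager = chainedTransactionManager
            }
    }
    ```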

    You need to show your code and the rest of the configuration.