apache-kafka, apache-kafka-connect, confluent-platform

Confluent Sink Connectors - How Many is Too Many


  1. I want to use the Confluent sink connector to update Postgres databases on remote servers. On an average day we may have 1000-2000 messages (updates). Is it "legitimate" to create hundreds of sink connectors to copy those messages to all the remote servers? Is having that many sink connectors normal?

  2. Is it possible to tell the sink connectors to "spread out" their work so that not all of the hundreds of connectors push changes to the remote servers at the same time? Should "timestamp.delay.interval.ms" be used for that?

Thanks


Solution

    1. In theory you could have one connector per target server. One connector can stream data from multiple topics to a single server. You may find yourself increasing the number of connectors if configuration varies by topic (such as different primary key column names, differing insert.mode requirements, etc.) - and that is totally valid ("legitimate" 😄). See the sketch after this list for what such a per-server connector might look like.

    2. Each connector will spawn one or more tasks to carry out the work. If one connector is streaming data for multiple target objects, it can parallelise that work with concurrent tasks if you want it to. If you want to stream data in a serial fashion, so that there is just the one connection to the database, set tasks.max to 1. (Note that "timestamp.delay.interval.ms" is a JDBC source connector property, so it won't stagger the sink connectors' writes.)
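
    For illustration, here is a minimal sketch of registering one such per-server connector through the Kafka Connect REST API. The worker URL, connector name, topic names, and connection details are placeholders, not taken from the question - adjust them for your environment:

```python
# Sketch: register one JDBC sink connector per target server via the
# Kafka Connect REST API. All names and connection details are placeholders.
import requests

connect_url = "http://localhost:8083"  # Kafka Connect worker (assumed address)

connector = {
    "name": "pg-sink-server-01",  # one connector per target server
    "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
        "topics": "orders,customers,inventory",  # several topics, one connector
        "connection.url": "jdbc:postgresql://server-01.example.com:5432/appdb",
        "connection.user": "kafka_connect",
        "connection.password": "********",
        "insert.mode": "upsert",    # split into another connector if this differs per topic
        "pk.mode": "record_key",
        "auto.create": "true",
        "tasks.max": "1",           # 1 = serial writes, a single connection to the database
    },
}

# POST /connectors creates the connector on the Connect cluster.
resp = requests.post(f"{connect_url}/connectors", json=connector, timeout=10)
resp.raise_for_status()
print(resp.json())
```
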

    To learn more about the connector/task execution model, see the docs.
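
    As a rough sketch of what that execution model looks like at runtime, you can ask the Connect REST API how a connector's work has been split into tasks. The endpoints below are standard Kafka Connect; the worker URL and connector name are the same placeholders used above:

```python
# Sketch: inspect how a connector's work is split across tasks using the
# standard Kafka Connect REST endpoints. URL and connector name are placeholders.
import requests

connect_url = "http://localhost:8083"
connector = "pg-sink-server-01"

# Overall connector state, plus one entry per running task.
status = requests.get(f"{connect_url}/connectors/{connector}/status", timeout=10).json()
print("connector state:", status["connector"]["state"])
for task in status["tasks"]:
    print(f"task {task['id']}: {task['state']} on worker {task['worker_id']}")

# The task list shows how many tasks the connector actually created
# (bounded above by tasks.max).
tasks = requests.get(f"{connect_url}/connectors/{connector}/tasks", timeout=10).json()
print(f"{len(tasks)} task(s) configured")
```
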