I am trying to transfer data from Oracle DB to MongoDB using Kafka, so I configured the Kafka cluster as shown in the picture above. I understand that adjusting the number of partitions and tasks.max allows parallel processing. However, when I run the connector, it always runs as a single task and nothing is processed in parallel. Are there any additional settings I need to configure?
Here is what I configured.
bin/kafka-topics.sh --create --bootstrap-server 127.0.0.1:9092,127.0.0.2:9092,127.0.0.1:9093 --partitions 3 --topic topicA
Connector config:
{
  "name": "rawsumc-source",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:oracle:thin:@127.0.0.1:1521/orcl",
    "connection.user": "test",
    "connection.password": "test",
    "topic.prefix": "topicA",
    "mode": "bulk",
    "poll.interval.ms": "360000000",
    "numeric.mapping": "best_fit",
    "tasks.max": "10",
    "connection.type": "lz4",
    "query": "select CAST(NO_TT AS NUMBER(10,0)) AS NO_TT,CAST(NO_SEQ AS NUMBER(10,0)) AS NO_SEQ,DNT_CLCT from table_a",
    "name": "rawsumc-source"
  },
  "tasks": [
    {
      "connector": "rawsumc-source",
      "task": 0
    }
  ],
  "type": "source"
}
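The "tasks" array in that response is what shows how many tasks were actually started, regardless of tasks.max. As a minimal sketch, assuming the response has been saved as a JSON string (the `count_tasks` helper and the trimmed `connector_info` sample are hypothetical and only for illustration), the two values can be compared like this:

```python
import json

# Trimmed copy of the connector info shown above (only the fields used here).
connector_info = """
{
  "name": "rawsumc-source",
  "config": {"tasks.max": "10"},
  "tasks": [{"connector": "rawsumc-source", "task": 0}],
  "type": "source"
}
"""


def count_tasks(info_json: str) -> int:
    """Return the number of entries in the "tasks" array of a connector info document."""
    info = json.loads(info_json)
    return len(info.get("tasks", []))


if __name__ == "__main__":
    requested = int(json.loads(connector_info)["config"]["tasks.max"])
    print(f"tasks.max={requested}, tasks created={count_tasks(connector_info)}")
```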
According to the docs:
tasks.max
- The maximum number of tasks that should be created for this connector. The connector may create fewer tasks if it cannot achieve this level of parallelism.
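Using a custom query with JdbcSourceConnector limits you to a single task. The connector divides work among tasks by table, and a query counts as a single unit of work, so a tasks.max of 10 still yields one task. The number of topic partitions does not change this for a source connector.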
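The single-task limit can be pictured with a rough model of how the task count is derived. This is a sketch of the behavior described above, not the connector's actual code; the function name `effective_task_count` and its parameters are hypothetical.

```python
from typing import Optional, Sequence


def effective_task_count(
    tasks_max: int,
    tables: Sequence[str],
    query: Optional[str] = None,
) -> int:
    """Hypothetical model of how many tasks a table-partitioned source starts."""
    if tasks_max < 1:
        raise ValueError("tasks_max must be at least 1")
    if query:
        # A custom query is one indivisible unit of work: always a single task.
        return 1
    # Without a query, work is split per table, so tasks cannot outnumber tables.
    return min(tasks_max, len(tables))


if __name__ == "__main__":
    # The configuration from the question: tasks.max=10 plus a custom query.
    print(effective_task_count(10, [], query="select ... from table_a"))
    # Three tables and no query: capped by the number of tables.
    print(effective_task_count(10, ["table_a", "table_b", "table_c"]))
```

Under this model, raising tasks.max has no effect while "query" is set. More than one task becomes possible only when the connector reads several tables directly, for example through the connector's table.whitelist setting instead of a query.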