I am puzzled by this scenario. Imagine I have two options (I have M nodes, all with the same tables):
kafka_table -> MV_1 -> local_table_1
            -> MV_2 -> local_table_2
            ...
            -> MV_N -> local_table_N
In this case, when an insertion into any of the local_table_<id> tables fails, the consumer marks the consume as failed, retries the message at the current offset, and does not commit a new offset.
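For context, the first topology can be sketched with minimal DDL like this (all names, the broker address, and the topic are hypothetical; only one of the N branches is shown):

```sql
-- Kafka consumer table: reading from it advances the consumer group offset.
CREATE TABLE kafka_table
(
    payload String
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'broker:9092',   -- assumption
         kafka_topic_list  = 'events',        -- assumption
         kafka_group_name  = 'ch_consumer',   -- assumption
         kafka_format      = 'JSONEachRow';

-- Local storage table on each node.
CREATE TABLE local_table_1
(
    payload String
)
ENGINE = MergeTree
ORDER BY tuple();

-- The MV pulls blocks from the Kafka table and pushes them into the
-- local table; a failed push here blocks the offset commit.
CREATE MATERIALIZED VIEW MV_1 TO local_table_1
AS SELECT payload FROM kafka_table;
```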
But in a new scenario:
kafka_table -> MV_1 -> dist_table_1 -> local_table_1
            -> MV_2 -> dist_table_2 -> local_table_2
            ...
            -> MV_N -> dist_table_N -> local_table_N
I don't know what exactly will happen here. When will a new Kafka offset be committed? ClickHouse by default uses asynchronous insertion for Distributed tables (insert_distributed_sync = 0): will the new Kafka offset be committed when the background insert job is created, or only when it succeeds? How does ClickHouse manage this sync/async mechanism in this case?
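The second topology differs only in the extra Distributed layer; a minimal sketch of one branch (cluster name 'my_cluster' and all table names are assumptions):

```sql
-- Local storage table, created on every node of the cluster.
CREATE TABLE local_table_1 ON CLUSTER my_cluster
(
    payload String
)
ENGINE = MergeTree
ORDER BY tuple();

-- Distributed proxy table that fans inserts out to the shards.
CREATE TABLE dist_table_1 ON CLUSTER my_cluster AS local_table_1
ENGINE = Distributed(my_cluster, currentDatabase(), local_table_1, rand());

-- The MV now targets the Distributed table instead of the local one.
CREATE MATERIALIZED VIEW MV_1 TO dist_table_1
AS SELECT payload FROM kafka_table;
```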
A materialized view doesn't know about the engine of its destination table.
AFAIK engine=Kafka commits the offset when all first-level MVs return OK for their insert into the destination table.
For engine=Distributed with default settings, that means the offset is committed once the data has been successfully stored in local temporary .bin files on the initiator node; delivery to the remote shards happens later, in the background.
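If committing the offset before the data reaches the shards is not acceptable, inserts into Distributed tables can be made synchronous with insert_distributed_sync = 1. A sketch (for the Kafka consumer path the setting has to come from the profile the consumer runs under, since you cannot attach a SETTINGS clause to its internal insert):

```sql
-- Manual insert: the SETTINGS clause makes this statement wait until
-- the remote shards have accepted the data, not just the .bin files.
INSERT INTO dist_table_1 SETTINGS insert_distributed_sync = 1
VALUES ('example');

-- For the Kafka -> MV -> Distributed path, set it in the server-side
-- profile instead (users.xml fragment, shown here as a comment):
--   <profiles><default>
--       <insert_distributed_sync>1</insert_distributed_sync>
--   </default></profiles>
```

The trade-off is that synchronous distributed inserts make the consume slower and a single unreachable shard will stall the consumer and block offset commits.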