I want to periodically insert data from an MySQL database into clickHouse, i.e., when data is added/updated in mySQL database, I want that data to be added automatically to clickHouse.
I am thinking of using the Change Data Capture (CDC). CDC is a technique that captures changes made to data in MySQL and applies it to the destination ClickHouse table. It only imports changed data, not the entire database. To use the CDC method with a MySQL database, we must utilize the Binary Change Log (binlog). Binlog allows us to capture change data as a stream, enabling near real-time replication.
Binlog not only captures data changes (INSERT, UPDATE, DELETE) but also table schema changes such as ADD/DROP COLUMN. It also ensures that rows deleted from MySQL are also deleted in ClickHouse.
After having the changes, How can I insert it in the ClickHouse?
[experimental] MaterializedMySQL Creates ClickHouse database with all the tables existing in MySQL, and all the data in those tables.
ClickHouse server works as MySQL replica. It reads binlog and performs DDL and DML queries.
Keep in mind that MaterializedMySQL is now (2024) absolute and can not be used:
https://clickhouse.tech/docs/en/engines/database-engines/materialized-mysql/
Other options to consider:
https://altinity.com/blog/2018/6/30/realtime-mysql-clickhouse-replication-in-practice
https://altinity.com/blog/dictionaries-explained https://altinity.com/blog/2020/5/19/clickhouse-dictionaries-reloaded