debezium change-data-capture

How to apply Debezium CDC events from Pub/Sub onto a database?


I am setting up an environment to test CDC (Change Data Capture) with Debezium, replicating tables from two source databases into one target database. The CDC events are captured and published to Google Pub/Sub topics, but I do not know how to propagate those messages to the target database. The missing piece is marked with a question mark in the image below. How can I apply the messages from my Google Pub/Sub topics to my target database (DB C)?

[Image: CDC test environment]

DB A and DB B are both PostgreSQL instances; the database system for DB C has not been decided yet.

Each message in my Pub/Sub topics contains a Debezium CDC event.

[Image: Pub/Sub sample message]

I would also like to keep Google Pub/Sub as the message broker if possible.


Solution

  • The idea of Debezium is to take data at rest (DB A and DB B) and put it into motion, using Change Data Capture to do so. Once a target messaging system (Kafka, Kinesis, Event Hubs, or Pub/Sub) is chosen, the change events are sent to a set of topics on that system.

    From there, some application has to read the events and write them to a sink (DB C). This can be any application that can consume from the messaging system and connect to DB C. Stream-processing systems such as Storm, Spark, and Flink can consume from these topics and persist the data, and cloud providers offer managed options as well. Since you are using Google Pub/Sub, there may also be options that read from it natively.
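
As a rough illustration of such a consumer application, here is a minimal Python sketch, not a definitive implementation. It assumes the messages carry Debezium's JSON envelope (`op`, `before`, `after`, `source.table`, optionally wrapped in `payload`), that every table has a primary key column named `id` (a hypothetical choice), and it uses SQLite purely as a stand-in for the undecided DB C. The names `apply_event` and `run_subscriber` are made up for this example; the Pub/Sub part relies on the `google-cloud-pubsub` client library.

```python
import json
import re
import sqlite3
import threading

PK = "id"  # assumption: every replicated table has a primary key column "id"
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _ident(name):
    """Quote a table/column name after validating it (names cannot be bound as parameters)."""
    if not _IDENT.match(name):
        raise ValueError(f"unexpected identifier: {name!r}")
    return f'"{name}"'


def apply_event(conn, raw):
    """Apply one Debezium change event (JSON bytes or str) to the target database.

    Returns True if a row was written or deleted, False if the message was skipped.
    """
    if not raw:
        return False  # empty body, e.g. a tombstone
    event = json.loads(raw)
    if not isinstance(event, dict):
        return False
    # With schemas enabled the change data sits under "payload"; otherwise it is top level.
    payload = event.get("payload", event)
    if not payload:
        return False
    op = payload.get("op")
    table = _ident(payload["source"]["table"])
    if op in ("c", "r", "u"):  # create, snapshot read, update
        row = payload["after"]
        cols = ", ".join(_ident(c) for c in row)
        marks = ", ".join("?" for _ in row)
        # INSERT OR REPLACE is SQLite's upsert; it makes redelivered messages harmless.
        conn.execute(
            f"INSERT OR REPLACE INTO {table} ({cols}) VALUES ({marks})",
            list(row.values()),
        )
    elif op == "d":  # delete: "before" carries at least the primary key
        conn.execute(
            f"DELETE FROM {table} WHERE {_ident(PK)} = ?", (payload["before"][PK],)
        )
    else:
        return False  # other event types are ignored in this sketch
    conn.commit()
    return True


def run_subscriber(project_id, subscription_id, db_path):
    """Pull messages from a Pub/Sub subscription and apply them (blocks forever)."""
    from google.cloud import pubsub_v1  # requires the google-cloud-pubsub package

    subscriber = pubsub_v1.SubscriberClient()
    path = subscriber.subscription_path(project_id, subscription_id)
    # Callbacks run on worker threads, so share one connection behind a lock.
    conn = sqlite3.connect(db_path, check_same_thread=False)
    lock = threading.Lock()

    def callback(message):
        try:
            with lock:
                apply_event(conn, message.data)
            message.ack()
        except Exception:
            message.nack()  # let Pub/Sub redeliver the message

    subscriber.subscribe(path, callback=callback).result()
```

With a real target database, the SQL in `apply_event` would use that system's own upsert syntax and driver. Note that Pub/Sub can deliver a message more than once and does not preserve order unless message ordering is enabled on the subscription; the upsert covers redelivery, but ordering still has to be handled separately.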