postgresqlapache-kafka-connectdebezium

Postgres Debezium does not publish the previous state of a record


I successfully installed Postgres Debezium CDC. Now, I'm able to catch all changes happening to the database. But the problem is "before" field always stays empty. So, if I insert a record (id = 1, name = Bill) I then get from Kafka this data:

'payload': {'before': None, 'after': {'id': 1, 'name': 'Bill'}, ...

But if I update the record like so:

UPDATE mytable set name = 'Bob' WHERE id = 1

I get this from Kafka:

'payload': {'before': None, 'after': {'id': 1, 'name': 'Bob'}, ...

This is how I configured my connector:

curl -X POST  localhost:8083/connectors/ \
  -H "Accept:application/json" -H "Content-Type:application/json" -d \
'{
    "name": "test-connector",
    "config": {
         "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
         "tasks.max": "1",
         "plugin.name": "pgoutput",
         "database.hostname": "postgres",
         "database.port": "5432",
         "database.user": "postgres",
         "database.password": "postgres",
         "database.dbname" : "test",
         "database.server.name": "postgres",
         "database.whitelist": "public.mytable",
         "database.history.kafka.bootstrap.servers": "kafka:9092",
         "database.history.kafka.topic": "public.topic"
    }
}'

What is wrong with that and how can I fix it?


Solution

  • before is an optional field that if present contains the state of the row before the event occurred. Whether or not this field is available is highly dependent on the REPLICA IDENTITY setting for each table.

    REPLICA IDENTITY is a PostgreSQL specific table-level setting which determines the amount of information that is available to logical decoding in case of UPDATE and DELETE events.

    To show the previous values of all the table’s columns, please set the REPLICA IDENTITY level to FULL:

    ALTER TABLE public.mytable REPLICA IDENTITY FULL;
    

    See more details in the Debezium docs.