google-cloud-platform google-bigquery google-cloud-pubsub

_CHANGE_TYPE not working when streaming rows into Google BigQuery from Pub/Sub


The Pub/Sub documentation (https://cloud.google.com/pubsub/docs/bigquery#change_data_capture) states that when streaming rows into Google BigQuery from Pub/Sub, we should be able to set "_CHANGE_TYPE": "UPSERT" in the JSON message so that the row is updated by primary key.

I currently have a primary key set and can write initial rows while testing, but when I try to rewrite the same row with the _CHANGE_TYPE value included, the message remains unacknowledged in the subscription.

I'm not able to find any relevant errors in logging.

Has anyone else had this issue or gotten this feature working?

Sample Schema:

id: STRING,
time: TIMESTAMP

Sample Pub/Sub Message 1:

{
  "id": "abcd1234",
  "time": "2024-09-01 00:00:00"
}

Sample Pub/Sub Message 2 (with UPSERT):

{
  "id": "abcd1234",
  "time": "2024-09-02 00:00:00",
  "_CHANGE_TYPE": "UPSERT"
}
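
For anyone trying to reproduce this, here is a minimal sketch of how the two sample messages above could be built and published from Python. It assumes the google-cloud-pubsub client library is installed; build_message and publish_row are hypothetical helper names, and the project and topic IDs are placeholders, not values from my setup.

```python
import json

# _CHANGE_TYPE values described in the change data capture docs.
VALID_CHANGE_TYPES = {"UPSERT", "DELETE"}


def build_message(row, change_type=None):
    """Return a UTF-8 JSON payload for one row, optionally tagged with _CHANGE_TYPE."""
    payload = dict(row)  # copy so the caller's dict is not modified
    if change_type is not None:
        if change_type not in VALID_CHANGE_TYPES:
            raise ValueError(f"unsupported _CHANGE_TYPE: {change_type!r}")
        payload["_CHANGE_TYPE"] = change_type
    return json.dumps(payload).encode("utf-8")


def publish_row(project_id, topic_id, row, change_type=None):
    """Publish one row to a topic and return the Pub/Sub message ID."""
    # Imported here so build_message can be used without the client library.
    from google.cloud import pubsub_v1

    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(project_id, topic_id)
    future = publisher.publish(topic_path, data=build_message(row, change_type))
    return future.result()  # blocks until the publish succeeds or raises
```

Usage would look something like publish_row("my-project", "my-topic", {"id": "abcd1234", "time": "2024-09-02 00:00:00"}, "UPSERT"), where both IDs are placeholders.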

Solution

  • My issue was that the Pub/Sub service account did not have sufficient permissions.

    Depending on your use case, you may have three distinct permission scenarios, but to be sure this worked for WRITE, UPSERT and DELETE, I just granted my Pub/Sub service account the BigQuery Admin role. That is probably overkill, but it works.
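
    As a rough sketch of the grant described above: this assumes the "Pub/Sub service account" is the Google-managed Pub/Sub service agent for the project (if your subscription uses a custom service account, substitute that instead) and that the gcloud CLI is installed and authenticated. The helper names and the project ID/number are hypothetical placeholders.

```python
import subprocess


def pubsub_service_agent(project_number):
    """IAM member string for the Google-managed Pub/Sub service agent (assumed account)."""
    return f"serviceAccount:service-{project_number}@gcp-sa-pubsub.iam.gserviceaccount.com"


def grant_role_command(project_id, project_number, role="roles/bigquery.admin"):
    """Build the gcloud command that binds a role to the Pub/Sub service agent."""
    return [
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        f"--member={pubsub_service_agent(project_number)}",
        f"--role={role}",
    ]


def grant_role(project_id, project_number, role="roles/bigquery.admin"):
    """Run the command; raises CalledProcessError if gcloud exits with an error."""
    subprocess.run(grant_role_command(project_id, project_number, role), check=True)
```

    The default role here mirrors the BigQuery Admin grant from the answer. A narrower role can be passed via the role argument once you know which permissions your scenario actually needs.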