When creating an incremental table, the behavior is that when there is a matched, defined by "uniqueKey", the other fields will be updated. I would like to, instead, not update when there is a match, and only insert new rows without duplicates. How can I achieve this? An alternative I can think of is to write my own operation, but then I lose the benefit of previewing with the SELECT statement.
config {
type: "incremental",
uniqueKey: ["alarm_number"],
bigquery: {
partitionBy: "DATE(alarm_time)",
clusterBy: ["alarm_number", "location", "element"],
updatePartitionFilter: "time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 4 DAY)"
}
}
pre_operations {
DECLARE trigger_checkpoint DEFAULT (
${when(incremental(),
`SELECT MAX(time) FROM ${self()}`,
`SELECT TIMESTAMP("2020-01-01")`)}
)
}
SELECT
*,
FALSE AS notified -- this column should not be updated when there is a duplicate/merge
FROM ${ref("source")}
AND deviation >= 1.1
The way I have been approaching this case is by verifying myself if the row is already in the destination table or not using the self()
reference.
In your case, I would do something like this:
SELECT
s.*,
FALSE AS notified -- this column should not be updated when there is a duplicate/merge
FROM ${ref("source")} s
LEFT JOIN ${self()} se on s.alarm_number = se.alarm_number
where
se.alarm_number is null