I'm playing with BQ and I create a table and inserted some data. I reinserted it and it created duplicates. I'm sure I'm missing something, but is there something I can do to ignore it if the data exists in the table?
My use case is I get a stream of data from various clients and sometimes their data will include some data they previously already sent(I have no control on them submitting).
Is there a way to prevent duplicates when certain conditions are met? The easy one is if the entire data is the same but also if certain columns are present?
It's difficult to answer your question without a clear idea of the table structure, but it feels like you could be interested in the MERGE
statement: ref here.
With this DML statement you can perform a mix of INSERT
, UPDATE
, and DELETE
statements, hence do exactly what you are describing.