I'm trying to update ~5K rows in bq using python client.
Here is my current try:
update_bq(table_id=_ADS_TO_REMOVE_TABLE_NAME, set_clasue="status ='removed'",
where_in_clause=f'''[{[item['ad_id'] for item in
current_ads_removed_json_chunk]}]''')
and
def update_bq(self, table_id, ad_ids, set_clasue, where_in_clause):
table_full_name = self.get_table_full_name(table_id)
query_text = f"""
UPDATE `{table_full_name}`
SET {set_clasue}
WHERE account_id IN {where_in_clause}
"""
query_job = self.client.query(query_text)
How can I map the account id list
to a string as follows which seems more efficient(?)
UPDATE mytable SET somefield=( CASE WHEN (id=100) THEN 'some value removed' WHEN (id=101) THEN 'some value removed' END ) WHERE id IN (100,101);
I've tried:
f'''UPDATE mytable SET somefield=( CASE WHEN id={['(id=100) THEN some value \'removed\''.join( item['ad_id'] for item in current_ads_removed_json_chunk]}]+"WHERE id IN ("+ item['ad_id'] for item in current_ads_removed_json_chunk]);
Plus will the bq auto timestamp work on update (works for me on insertion).
Is it better to just remove all the rows with these ids and re-insert with the new status?
I don't know if this will be more effciet, but you can try to create whole sql script in python. Script can contains all values. Values will be passed as inject_table, and then you can update all rows with a signle query.
Doc: inject table and UPDATE FROM
with inject_table (
select 1 as col1, 2 as col2 UNION ALL
select 5 as col1, 3 as col2
)
update tt
set tt.val = it.col1
from target_table as tt
join inject_table as it on tt.col2 = it.col2