Folks!
I need some ideas, with the follow problem:
I have two tables:
Table 1:
+-------+------------+---------+
| ID | field_name | value |
+-------+------------+---------+
| 1 | usd | 10.08 |
| 1 | gross_amt | 52.0 |
| 1 | jpy | 30.05 |
| 2 | usd | 50.0 |
| 2 | eur | 50.0 |
| 3 | real_amt | 210.43 |
| 3 | total | 320 |
| 4 | jpy | 23.45 |
| 4 | name | john |
| 4 | city | utah |
+-------+------------+---------+
Table 2:
+-----+-------+-----------+----------+---------+------+-------+-------+-------+-----------+----------+-------+-----+----------+
| ID | name | last_name | date1 | counrty | city | usd | eur | jpy | gross_amt | real_amt | total | ... | field200 |
+-----+-------+-----------+----------+---------+------+-------+-------+-------+-----------+----------+-------+-----+----------+
| 1 | jane | doe | 19900108 | usa | LA | 9.08 | 0.00 | 29.05 | 50.0 | 52.0 | 900.0 | ... | value200 |
| 2 | lane | smith | 19900108 | usa | LA | 40.8 | 40.0 | 0.00 | 100.0 | 70.0 | 290.0 | ... | value200 |
| 3 | mike | hoffa | 19900108 | usa | SF | 5.05 | 0.00 | 0.00 | 10.0 | 25.0 | 100.0 | ... | value200 |
| 4 | paul | doe | 19900108 | usa | NY | 1.00 | 0.00 | 29.05 | 45.0 | 55.0 | 110.0 | ... | value200 |
+-----+-------+-----------+----------+---------+------+-------+-------+-------+-----------+----------+-------+-----+----------+
I need to update the values of the fields in the table 2, which are in table 1 in column field_name
, with the values of table 1 column value
, both IDs are the same in both tables, beside that, the datatype of column value
in table 1 are string, but the data type of the columns to update in the table 2 are diferent, especially the numbers(numeric, int64, float64)
The tables above are an example, table 2 of the real problem has 200 fields and in table 1 for an ID there can be up to 40 value modifications for thousands of records to be modified daily
Thanks
I have tried the following two solutions:
Solution 1 (it works, but very slow, it is a lot of records):
DECLARE SQLSCRIPT STRING DEFAULT '';
DECLARE col, val, id STRING;
DECLARE n INT64;
DECLARE i INT64 DEFAULT 1;
SET n= (SELECT COUNT(*) FROM `project.dataset.table1`);
WHILE i <= n DO
SET col = (SELECT col FROM `project.dataset.table1` LIMIT 1);
SET val = (SELECT val FROM `project.dataset.table1` LIMIT 1);
SET id = (SELECT id FROM `project.dataset.table1` LIMIT 1);
SET SQLScript = (SELECT CONCAT('UPDATE `project.dataset.table2`` SET ',col,' = ',val,' WHERE id = ','"',id,'"'));
SET i = i + 1;
END WHILE;
EXECUTE IMMEDIATE SQLSCRIPT;
Solution 2 (I can't get it to work, it gives me the following error):
[error execution Big Query] [1]: https://i.sstatic.net/Pv44T.png
EXECUTE IMMEDIATE (SELECT STRING_AGG('UPDATE `project.dataset.table2` SET '||x.col||'="'||x.val||'" WHERE id = "'||x.id||'"', ';')
FROM UNNEST((SELECT ARRAY_AGG(STRUCT(id, col, val))
FROM `project.dataset.table1`)) AS x);
Below is for BigQuery Standard SQL
EXECUTE IMMEDIATE '''
CREATE TEMP TABLE pivot1 AS
SELECT id, ''' || (
SELECT STRING_AGG(DISTINCT "MAX(IF(field_name = '" || field_name || "', CAST(value AS " || data_type || "), NULL)) AS " || field_name)
FROM `project.dataset.table1`
JOIN (
SELECT column_name, data_type
FROM `project.dataset.INFORMATION_SCHEMA.COLUMNS`
WHERE tablename = 'table2'
) ON field_name = column_name
) || '''
FROM `project.dataset.table1`
GROUP BY id
''';
EXECUTE IMMEDIATE '''
MERGE `project.dataset.table2` AS t2
USING pivot1 AS t1
ON t2.id = t1.id
WHEN MATCHED THEN
UPDATE SET
''' || (
SELECT STRING_AGG(DISTINCT field_name || ' = IFNULL(t1.' || field_name || ', t2.' || field_name || ')')
FROM `project.dataset.table1`
);
SELECT * FROM `project.dataset.table2` ORDER BY id;
If to apply to sample data (table1 and table2) from your question - output is (updates are highlighted)