sqlgoogle-bigquerysql-updateexecute-immediate

How update a table in Big Query where the name of fields to update are values in another table


Folks!

I need some ideas, with the follow problem:

I have two tables:

Table 1:

+-------+------------+---------+
| ID    | field_name | value   |
+-------+------------+---------+
| 1     | usd        |  10.08  |
| 1     | gross_amt  |  52.0   |
| 1     | jpy        |  30.05  |
| 2     | usd        |  50.0   |
| 2     | eur        |  50.0   |
| 3     | real_amt   |  210.43 |
| 3     | total      |  320    |
| 4     | jpy        |  23.45  |
| 4     | name       |  john   |
| 4     | city       |  utah   |
+-------+------------+---------+

Table 2:

+-----+-------+-----------+----------+---------+------+-------+-------+-------+-----------+----------+-------+-----+----------+
| ID  |  name | last_name |   date1  | counrty | city |  usd  |  eur  |  jpy  | gross_amt | real_amt | total | ... | field200 |
+-----+-------+-----------+----------+---------+------+-------+-------+-------+-----------+----------+-------+-----+----------+
| 1   |  jane | doe       | 19900108 |   usa   | LA   | 9.08  | 0.00  | 29.05 | 50.0      |  52.0    | 900.0 | ... | value200 |
| 2   |  lane | smith     | 19900108 |   usa   | LA   | 40.8  | 40.0  | 0.00  | 100.0     |  70.0    | 290.0 | ... | value200 |
| 3   |  mike | hoffa     | 19900108 |   usa   | SF   | 5.05  | 0.00  | 0.00  | 10.0      |  25.0    | 100.0 | ... | value200 |
| 4   |  paul | doe       | 19900108 |   usa   | NY   | 1.00  | 0.00  | 29.05 | 45.0      |  55.0    | 110.0 | ... | value200 |
+-----+-------+-----------+----------+---------+------+-------+-------+-------+-----------+----------+-------+-----+----------+

I need to update the values of the fields in the table 2, which are in table 1 in column field_name, with the values of table 1 column value, both IDs are the same in both tables, beside that, the datatype of column value in table 1 are string, but the data type of the columns to update in the table 2 are diferent, especially the numbers(numeric, int64, float64)

The tables above are an example, table 2 of the real problem has 200 fields and in table 1 for an ID there can be up to 40 value modifications for thousands of records to be modified daily

Thanks

I have tried the following two solutions:

Solution 1 (it works, but very slow, it is a lot of records):

DECLARE SQLSCRIPT STRING DEFAULT '';
DECLARE col, val, id STRING;
DECLARE n INT64;
DECLARE i INT64 DEFAULT 1;

SET n= (SELECT COUNT(*) FROM `project.dataset.table1`);

WHILE i <= n DO
    SET col = (SELECT col FROM `project.dataset.table1`  LIMIT 1);
    SET val = (SELECT val FROM `project.dataset.table1`  LIMIT 1);
    SET id = (SELECT id FROM `project.dataset.table1`  LIMIT 1);
    SET SQLScript = (SELECT CONCAT('UPDATE `project.dataset.table2`` SET ',col,' = ',val,' WHERE id = ','"',id,'"'));
    SET i = i + 1;
END WHILE;
EXECUTE IMMEDIATE  SQLSCRIPT;

Solution 2 (I can't get it to work, it gives me the following error):

[error execution Big Query] [1]: https://i.sstatic.net/Pv44T.png

EXECUTE IMMEDIATE (SELECT STRING_AGG('UPDATE `project.dataset.table2` SET '||x.col||'="'||x.val||'" WHERE id = "'||x.id||'"', ';')
                      FROM UNNEST((SELECT ARRAY_AGG(STRUCT(id, col, val))
                                    FROM `project.dataset.table1`)) AS x);

Solution

  • Below is for BigQuery Standard SQL

    EXECUTE IMMEDIATE '''
    CREATE TEMP TABLE pivot1 AS
    SELECT id, ''' || (
      SELECT STRING_AGG(DISTINCT "MAX(IF(field_name = '" || field_name || "', CAST(value AS " || data_type || "), NULL)) AS " || field_name)
      FROM `project.dataset.table1`
      JOIN (
        SELECT column_name, data_type
        FROM `project.dataset.INFORMATION_SCHEMA.COLUMNS`
        WHERE tablename = 'table2' 
      ) ON field_name = column_name
    ) || '''  
    FROM `project.dataset.table1`
    GROUP BY id
    ''';
    
    EXECUTE IMMEDIATE '''
    MERGE `project.dataset.table2` AS t2
    USING pivot1 AS t1
    ON t2.id = t1.id
    WHEN MATCHED THEN
      UPDATE SET
    ''' || (
      SELECT STRING_AGG(DISTINCT field_name || ' = IFNULL(t1.' || field_name || ', t2.' || field_name || ')')
      FROM `project.dataset.table1` 
    );
    
    SELECT * FROM `project.dataset.table2` ORDER BY id;
    

    If to apply to sample data (table1 and table2) from your question - output is (updates are highlighted)

    enter image description here