sqlgoogle-cloud-platformgoogle-bigquery

Error with CONCAT Function While Constructing a Dynamic UNION ALL Query in BigQuery


I am trying to dynamically union all tables in a specific BigQuery dataset where the table names have the suffix "client". Due to the use of customer encryption keys, I cannot utilize wildcards, so I am using a FOR loop to build the query.

Here is the BigQuery script I am using:

DECLARE query STRING DEFAULT '';
DECLARE table_names ARRAY<STRING>;

SET table_names = (
  SELECT ARRAY_AGG(table_name) AS table_names
  FROM `project.dataset.INFORMATION_SCHEMA.TABLES`
  WHERE table_name LIKE '%client'
);

FOR table_name IN (SELECT * FROM UNNEST(table_names)) DO
  SET query = IF(
    query = '',
    CONCAT('SELECT * FROM `project.dataset.', table_name, '`'),
    CONCAT(query, ' UNION ALL SELECT * FROM `project.dataset.', table_name, '`')
  );
END FOR;

EXECUTE IMMEDIATE query;

However, I am encountering the following error:

Query error: No matching signature for function CONCAT for argument types: STRING, STRUCT<f0_ STRING>, STRING. Supported signatures: CONCAT(STRING, [STRING, ...]); CONCAT(BYTES, [BYTES, ...]) at [13:5]

It seems that the CONCAT function is having trouble with the types of arguments provided, particularly with the table_name.

Could someone help me understand why this error is occurring and suggest a solution to correctly build and execute the dynamic query?

Your help is much appreciated!


Solution

  • Meantime, below is the fix for your script in case if you need to run it this way:

    DECLARE query STRING DEFAULT '';
    DECLARE table_names ARRAY<STRING>;
    
    SET table_names = (
      SELECT ARRAY_AGG(table_name) AS table_names
      FROM `project.dataset.INFORMATION_SCHEMA.TABLES`
    #   WHERE table_name LIKE '%client'
    );
    
    FOR record IN (SELECT table_name FROM UNNEST(table_names) table_name) DO
      SET query = IF(
        query = '',
        CONCAT('SELECT * FROM `project.dataset.', record.table_name, '`'),
        CONCAT(query, ' UNION ALL SELECT * FROM `project.dataset.', record.table_name, '`')
      );
    END FOR;
    
    EXECUTE IMMEDIATE query;
    

    I think it is self descriptive. Most important to accept the fact that you are looping through records and each record is a struct. so if you need to reference specific field in that record you need to use record.column_name