sqlgoogle-bigqueryprocedural

When running a For loop in Big Query the sql_expression_list is not being evaluated the same as if it was run alone


I am running a For loop to gather a table_expression from a dataset's information schema that contains the names of columns that are of ARRAY type in a table. I am able to generate this table_expression as intended. From here I am trying to loop through the columns and evaluate each column's contents. The columns all contain an Array of Strings with "empty" items having a single whitespace.

 FOR field IN (
  SELECT
    field_path, data_type
  FROM
    `project.dataset.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS`
  WHERE
    table_name = 'table_1'
    AND data_type LIKE 'ARRAY%' ) DO
  SELECT
    ARRAY_LENGTH(field.field_path) AS s3,
  IF
    (REGEXP_CONTAINS(ARRAY_TO_STRING(field.field_path, ','),r'.*?\,\s\,.*?'),1,0) AS b1,
  IF
    (REGEXP_CONTAINS(ARRAY_TO_STRING(field.field_path, ','),r'^\s\,.*?'),1,0) AS b2,
  IF
    (REGEXP_CONTAINS(ARRAY_TO_STRING(field.field_path, ','),r'.*?\,\s$'),1,0) AS b3
  FROM
    `project.dataset.table_1`);
END
  FOR;

When I run this query I receive the error:

Query error: No matching signature for function ARRAY_LENGTH for argument types: STRING. Supported signature: ARRAY_LENGTH(ARRAY)

If I isolate the query after the DO, the query runs without this error and processes as intended.

I have run a similar loop to produce null counts on columns and had success, so it seems to be an issue with evaluating Arrays but I am unsure of what is causing it.

I have tried running the same query, treating field.field_path as a String value, but I do not receive the same results as when I run the DO query stand-alone and I believe it might be evaluating the field.field_path as a string value and not a variable.


Solution

  • I believe it might be evaluating the field.field_path as a string value and not a variable.

    Since field is a variable, it can't be used as a column name. Instead, you can consider a dynamic SQL like below.

    For example,

    FOR field IN (
      SELECT
        field_path, data_type
      FROM
        `project.dataset.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS`
      WHERE
        table_name = 'table_1'
        AND data_type LIKE 'ARRAY%' 
    ) DO
    EXECUTE IMMEDIATE FORMAT("""
      SELECT
        ARRAY_LENGTH(%s) AS s3
      FROM
        `project.dataset.table_1`
    """, field.field_path);
    END FOR;