I am running a For loop to gather a table_expression from a dataset's information schema that contains the names of columns that are of ARRAY type in a table. I am able to generate this table_expression as intended. From here I am trying to loop through the columns and evaluate each column's contents. The columns all contain an Array of Strings with "empty" items having a single whitespace.
FOR field IN (
SELECT
field_path, data_type
FROM
`project.dataset.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS`
WHERE
table_name = 'table_1'
AND data_type LIKE 'ARRAY%' ) DO
SELECT
ARRAY_LENGTH(field.field_path) AS s3,
IF
(REGEXP_CONTAINS(ARRAY_TO_STRING(field.field_path, ','),r'.*?\,\s\,.*?'),1,0) AS b1,
IF
(REGEXP_CONTAINS(ARRAY_TO_STRING(field.field_path, ','),r'^\s\,.*?'),1,0) AS b2,
IF
(REGEXP_CONTAINS(ARRAY_TO_STRING(field.field_path, ','),r'.*?\,\s$'),1,0) AS b3
FROM
`project.dataset.table_1`);
END
FOR;
When I run this query I receive the error:
Query error: No matching signature for function ARRAY_LENGTH for argument types: STRING. Supported signature: ARRAY_LENGTH(ARRAY)
If I isolate the query after the DO, the query runs without this error and processes as intended.
I have run a similar loop to produce null counts on columns and had success, so it seems to be an issue with evaluating Arrays but I am unsure of what is causing it.
I have tried running the same query, treating field.field_path as a String value, but I do not receive the same results as when I run the DO query stand-alone and I believe it might be evaluating the field.field_path as a string value and not a variable.
I believe it might be evaluating the field.field_path as a string value and not a variable.
Since field
is a variable, it can't be used as a column name. Instead, you can consider a dynamic SQL like below.
For example,
FOR field IN (
SELECT
field_path, data_type
FROM
`project.dataset.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS`
WHERE
table_name = 'table_1'
AND data_type LIKE 'ARRAY%'
) DO
EXECUTE IMMEDIATE FORMAT("""
SELECT
ARRAY_LENGTH(%s) AS s3
FROM
`project.dataset.table_1`
""", field.field_path);
END FOR;