Consider some data:
WITH
sequences AS (
SELECT 0 as primary_key, 0 AS key1, 1 AS key2, 3 AS value1, 4 AS value2 UNION ALL
SELECT 0, 1, 0, 8, 5 UNION ALL
SELECT 0, 1, 1, 1, 2 UNION ALL
SELECT 1, 2, 1, 1, 1 UNION ALL
SELECT 1, 2, 3, 4, 4
),
agg AS (
SELECT
primary_key,
key1,
key2,
ARRAY_AGG((SELECT AS STRUCT f.* EXCEPT (key1, key2))) AS agg
FROM sequences AS f
GROUP BY
primary_key,
key1,
key2
)
SELECT
a.primary_key,
a.key1,
a.key2,
ANY_VALUE(TO_JSON_STRING((agg))) AS result
FROM
agg AS a
GROUP BY
a.primary_key,
a.key1,
a.key2
This gives me data in format of:
However for future faster retrieval; I would like to get data in format of:
Any pointer on how I can make this happen?
Thank you so much for such a rich post. It is fantastic that you included everything needed to tinker with a solution. Your well described input data definitely made me want to spend time examining.
This is one possible solution:
WITH
sequences AS (
SELECT 0 as primary_key, 0 AS key1, 1 AS key2, 3 AS value1, 4 AS value2 UNION ALL
SELECT 0, 1, 0, 8, 5 UNION ALL
SELECT 0, 1, 1, 1, 2 UNION ALL
SELECT 1, 2, 1, 1, 1 UNION ALL
SELECT 1, 2, 3, 4, 4
),
agg AS (
SELECT
primary_key,
key1,
key2,
ARRAY_AGG((SELECT AS STRUCT f.* EXCEPT (key1, key2))) AS agg
FROM sequences AS f
GROUP BY
primary_key,
key1,
key2
)
SELECT primary_key,
JSON_ARRAY(
JSON_OBJECT(CAST(key1 AS STRING),
JSON_OBJECT(CAST(key2 AS STRING),agg.agg[0])
)
) AS result FROM agg
In this example, I am using the JSON functions as described in the BigQuery documentation (ref). The solution uses the JSON_ARRAY
function to create an array and then the elements of the array are nested JSON_OBJECT
instances using the data from the query as the values.