google-bigquerynested-tablearray-agg

How to avoid an array uf NULLS when making nested table in BQ


I am joining 2 tables, the join is left, and second table does not have all keys from the first, so that does produce NULLs on the right. How do I avoid them when grouping this into an array?

For example:

with 
  t1 AS (select 1 AS c1, 2 AS c2 UNION ALL select 5, 6),
  t2 AS (select 3 AS c1, 4 AS c4 UNION ALL select 3, 5, UNION ALL select 3,4, UNION ALL select 5,6)
SELECT t2.c1, ARRAY_AGG(DISTINCT t2.c4), ARRAY_AGG((SELECT AS STRUCT t1.*))
FROM t2 LEFT JOIN t1 ON t1.c1=t2.c1
GROUP BY t2.c1

The result is:

enter image description here

What I excpect is that the first row will have only one row of NULLs for the second table, smth like (sorry for poor drawings):

enter image description here


Solution

  • is the removal of all null statements ok?

    Then use an if statement inside the array_agg

    with 
      t1 AS (select 1 AS c1, 2 AS c2 UNION ALL select 5, 6),
      t2 AS (select 3 AS c1, 4 AS c4 UNION ALL select 3, 5, UNION ALL select 3,4, UNION ALL select 5,6)
    SELECT t2.c1, ARRAY_AGG(DISTINCT t2.c4), ARRAY_AGG( if(t1.c1 is null   , null, STRUCT(t1) ) ignore nulls  ), 
    FROM t2 LEFT JOIN 
    t1 
     ON t1.c1=t2.c1
    
    GROUP BY t2.c1