sqlprestotrinounnest

SQL Unnest - how to use correctly?


Say I have some data in a table, t.

id, arr
--, ---
1,  [1,2,3]
2,  [4,5,6]

SQL

SELECT AVG(n) FROM UNNEST(
    SELECT arr FROM t AS n) AS avg_arr 

This returns the error, 'Mismatched input 'SELECT'. Expecting <expression>. What is the correct way to unnest an array and aggregate the unnested values?


Solution

  • unnest is normally used with a cross join and will expand the array into relation (i.e. for every element of array an row will be introduced). To calculate average you will need to group values back:

    -- sample data
    WITH dataset (id, arr) AS (
        VALUES (1,  array[1,2,3]),
            (2,  array[4,5,6])
    ) 
    
    --query
    select id, avg(n)
    from dataset
    cross join unnest (arr) t(n)
    group by id
    

    Output:

    id _col1
    1 2.0
    2 5.0

    But you also can use array functions. Depended on presto version either array_average:

    select id, array_average(n)
    from dataset
    

    Or for older versions more cumbersome approach with manual aggregation via reduce:

    select id, reduce(arr, 0.0, (s, x) -> s + x, s -> s) / cardinality(arr)
    from dataset