sqlpostgresqlgoogle-bigqueryaggregates

Postgres SQL aggregates query in BigQuery?


For Below Postgres SQL query, I do use PIVOT in BigQuery, beside PIVOT, any other method for such query in BigQuery?

-- Postgres SQL --
    SELECT 
       Apple, 
       Orange,
       Lemon,
       CASE WHEN Apple >= 50 THEN 1 ELSE 0 END AS Apple50
       CASE WHEN Orange >= 50 THEN 1 ELSE 0 END AS Orange50
       CASE WHEN Lemon >= 50 THEN 1 ELSE 0 END AS Lemon50
    FROM (
       SELECT td.timestamp,
          COALESCE(MAX(td.value) FILTER (WHERE attribute_id = 16), 0) as Apple, 
          COALESCE(MAX(td.value) FILTER (WHERE attribute_id = 17), 0) as Orange, 
          COALESCE(MAX(td.value) FILTER (WHERE attribute_id = 18), 0) as Lemon
       FROM TableData td
       WHERE td.attribute_id IN (16, 17, 18) 
       GROUP BY td.timestamp
       ORDER BY timestamp;
    ) AS td2

-- My attempt BigQuery Query -- 
SELECT
    value_16 as Apple,
    value_17 as Orange,
    value_18 as Lemon,
    CASE WHEN value_16 >= 50 THEN 1 ELSE 0 END as Apple50
    CASE WHEN value_17 >= 50 THEN 1 ELSE 0 END as Orange50
    CASE WHEN value_18 >= 50 THEN 1 ELSE 0 END AS Lemon50
FROM (
        SELECT * FROM(
            SELECT 
                timestamp,
                attribute_id,
                value
            FROM `PROJECT_ID.DB_NAME.FRUITS` as td
            WHERE td.attribute_id IN (16,17,18)
        )PIVOT
        (
            MAX(value) as value
            FOR attribute_id IN (16,17,18)
        )
)as td2

Below is the sample relation of the table.

-- TableData --
attribute_id  | value     | timestamp  |
--------------+-----------+------------+
17            | 100       | 1618822794 |
17            | 100       | 1618822861 |
16            | 50        | 1618822794 |
16            | 50        | 1618822861 |

-- TableAttribute --
id            | name     |
--------------+----------+
16            | Apple    |
17            | Orange   |
18            | Lemon    |

-- Expected Result --
timestamp     | Apple   | Orange | Lemon | Apple50 | Orange50 | Lemon50 |
--------------+---------+--------+-------+---------+----------+---------+
1618822794    | 50      | 100    | 0     | 1       | 1        | 0
1618822861    | 50      | 100    | 0     | 1       | 1        | 0

Solution

  • Pivot is likely the best way to achieve what you're wanting. Consider the following approach though as it might be simpler to manage:

    with aggregate_data as (
        select td.timestamp
            , ta.name
            , td.value as value 
        from TableData td
        full outer join TableAttribute ta
        on td.attribute_id = ta.id
    )
    
    select timestamp
        , value_Apple as Apple
        , value_Orange as Orange
        , value_Lemon as Lemon
        , _50_Apple as Apple50
        , _50_Orange as Orange50
        , _50_Lemon as Lemon50
    from aggregate_data
    pivot(max(value) value, max(case when value >=50 then 1 else 0 end) _50 for name in ('Apple', 'Orange', 'Lemon'))
    where timestamp is not null