google-bigquerybigquery-udf

Group by multiple nested fields and return result in BigQuery


I have a super market data and I want to list all super market branches in the city and then all the departments in that super market like grocery, cosmetics. List sales for each department and finally calculate total sale in each department. So my response should look something like below

Branch Department Item Total Amt
1 Grocery Rice 150
Wheat 200
Cosmetics Cream 300
Powder 100
2 Grocery Oil 400
Dal 200
Cosmetics Serum 150
Shampoo 100

I have tried the below query

SELECT branch, department, item , totalAmt
   FROM project.dataset.table 
   GROUP BY branch, department
   ORDER BY branch, department limit 100

But it is giving result like below:

Branch Department Item Total Amt
1 Grocery Rice 150
1 Grocery Wheat 200
1 Cosmetics Cream 300
1 Cosmetics Powder 100
2 Grocery Oil 400
2 Grocery Dal 200
2 Cosmetics Serum 150
2 Cosmetics Shampoo 100

Is it possible to group by Branch and inside that Department and get results like how it is in the first table using BigQuery?


Solution

  • Use array_agg and struct function after summing to get the total amount in a subquery. See below

    to create a sample data set

    WITH ProductData AS (
        SELECT 1 AS branch, 'Grocery' AS department, 'Rice' AS item, 150 AS Value UNION ALL
        SELECT 1, 'Grocery', 'Wheat', 200 UNION ALL
        SELECT 1, 'Cosmetics', 'Cream', 300 UNION ALL
        SELECT 1, 'Cosmetics', 'Powder', 100 UNION ALL
        SELECT 2, 'Grocery', 'Oil', 400 UNION ALL
        SELECT 2, 'Grocery', 'Dal', 200 UNION ALL
        SELECT 2, 'Cosmetics', 'Serum', 150 UNION ALL
        SELECT 2, 'Cosmetics', 'Shampoo', 100 union all
         SELECT 1, 'Grocery', 'Wheat', 99 UNION ALL
        SELECT 1, 'Cosmetics', 'Cream', 33 UNION ALL
        SELECT 1, 'Cosmetics', 'Powder', 11 UNION ALL
        SELECT 2, 'Grocery', 'Oil', 44 UNION ALL
        SELECT 2, 'Grocery', 'Dal', 22 UNION ALL
        SELECT 2, 'Cosmetics', 'Serum', 15 UNION ALL
        SELECT 2, 'Cosmetics', 'Shampoo', 13
    )
    

    to get the desired results

    SELECT branch, 
           department,
           ARRAY_AGG(STRUCT(item AS item, total_amt AS amount)) AS items
    FROM (
        SELECT branch, department, item, SUM(value) AS total_amt
        FROM ProductData
        GROUP BY branch, department, item
    )
    

    GROUP BY branch, department ORDER BY branch, department;

    results enter image description here