I have a table that looks like the one below:
Shop | Year | Region | Waste | Avg Waste (Year,Region) | Lift | Column_I_want_To_Calculate (apply case when statements) CASE WHEN Lift > Avg(Lift) OVER (PARTITION BY YEAR, REGION) THEN 1 ELSE 0 END |
---|---|---|---|---|---|---|
a | 2021 | CA | 10 | 15 =>(10+20)/2 | 0.67 => 10/15 | 0.67 < (0.67+1.34)/2 = 1.005 THEN 0 |
b | 2021 | CA | 20 | 15=> (10+20)/2 | 1.34 => 20/15 | 1.34 > (0.67+1.34)/2 = 1.005 THEN 1 |
c | 2021 | FL | 8 | 8 => 8/1 | 8/8 | 8 = 8 THEN 0 |
d | 2020 | LA | 25 | 22 => (25+19)/2 | 0.88 => 25/22 | 0.88 > (0.88+0.87)/2 = 0.875 THEN 1 |
e | 2020 | LA | 19 | 22 => (25+19)/2 | 0.87 => 19/22 | 0.87 < (0.88+0.87)/2 = 0.875 THEN 0 |
f | 2019 | NY | 35 | 35 | 35/35 | 35 = 35 THEN 0 |
So far I have calculated the columns Shop, Year, Region, Waste, Avg Waste (Year, Region), Lift. I want to calculate the one marked as Column_I_want_To_Calculate.
Briefly, it computes the average lift per Region and Year and compares Shops' Lift
with the Average Lift of all shops in the same Region
and Year
. Then assigns the value 1 or 0 in case of a greater than statement.
So far I have tried (PostgreSQL),
SELECT shop
,year
,region
,waste
,AVG(waste) over (partition by year, region) as "Avg Waste (Year,Region)"
,waste/avg(waste) over (partition by year, region) AS Lift,
,CASE WHEN waste/avg(waste) over (partition by year, region) >
(SELECT tab2.avg_lift
FROM (
SELECT tab1.year, tab1.region, AVG(tab1.lift) OVER (PARTITION BY tab1.year, tab1.region) avg_lift
FROM (
SELECT year, region, waste/ avg(waste) over (partition by year, region) AS lift
FROM main_table
GROUP BY year,region,waste
ORDER BY lift DESC
) tab1
GROUP BY tab1.year, tab1.region, tab1.lift
) tab2
) THEN 1 ELSE 0 END AS "Column_I_want_To_Calculate"
FROM main_table
GROUP BY shop,
year,
nomos,
waste
;
However, the code above throws the exception
postgresql error: more than one row returned by a subquery used as an expression
This one returns the required output based on your input:
SELECT
region
, shop
, waste
, round(AVG(waste) OVER w,2) AS avg_waste
, round(waste / AVG(waste) OVER w,2) AS lift
, CASE
WHEN waste > AVG(waste) OVER w THEN 1
ELSE 0
END AS above_average
FROM i
WINDOW w AS (PARTITION BY year, region)
ORDER BY
1,2,3;