Hi I'm trying to populate the row_number using row_number() over (partition, but the results is not giving me the row number 1 where the first record is cumulative_cost >= threshold and the subsequent rows all have the same row number.
This is the result
date | cost_amt | cumulative_cost | threshold_reached_order | product_id |
---|---|---|---|---|
9/07/2023 | 14.09 | 14.09 | 0 | 12345 |
9/07/2023 | 10.2 | 24.29 | 0 | 12345 |
9/07/2023 | 25.03 | 49.32 | 3 | 12345 |
11/07/2023 | 28.09 | 77.41 | 4 | 12345 |
I'm expecting this result
date | cost_amt | cumulative_cost | threshold_reached_order | product_id |
---|---|---|---|---|
9/07/2023 | 14.09 | 14.09 | 0 | 12345 |
9/07/2023 | 10.2 | 24.29 | 0 | 12345 |
9/07/2023 | 25.03 | 49.32 | 1 | 12345 |
11/07/2023 | 28.09 | 77.41 | 2 | 12345 |
select date, cost_amt, cumulative_cost,
case when cumulative_cost >= threshold then
row_number() OVER (partition BY product_name, userid ORDER BY date, order_id)
else 0
end AS threshold_reached_order,
product_id
from table1
where userid = 'ABCDEFG'
order by userid , product_id
The row_number() OVER (partition BY product_name, userid ORDER BY date, order_id) is a window function that assigns a unique number to each row within a partition, which is determined by the product_name and userid columns. The rows are ordered by date and order_id.
The rows are already given the number and then you are putting the condition.
Before cumulative_cost >= threshold condition:
date | cost_amt | cumulative_cost | threshold_reached_order | product_id |
---|---|---|---|---|
9/07/2023 | 14.09 | 14.09 | 1 | 12345 |
9/07/2023 | 10.2 | 24.29 | 2 | 12345 |
9/07/2023 | 25.03 | 49.32 | 3 | 12345 |
11/07/2023 | 28.09 | 77.41 | 4 | 12345 |
After cumulative_cost >= threshold condition:
date | cost_amt | cumulative_cost | threshold_reached_order | product_id |
---|---|---|---|---|
9/07/2023 | 14.09 | 14.09 | 0 | 12345 |
9/07/2023 | 10.2 | 24.29 | 0 | 12345 |
9/07/2023 | 25.03 | 49.32 | 3 | 12345 |
11/07/2023 | 28.09 | 77.41 | 4 | 12345 |
You can try using this query:
SELECT
date,
cost_amt,
cumulative_cost,
CASE
WHEN cumulative_cost >= threshold THEN
SUM(CASE WHEN cumulative_cost >= threshold THEN 1 ELSE 0 END)
OVER (PARTITION BY product_id ORDER BY date, order_id)
ELSE 0
END AS threshold_reached_order,
product_id
FROM table1
WHERE userid = 'ABCDEFG'
ORDER BY userid, product_id, date, order_id;