I am trying to write a query to return ALL rows for an ID where a condition is met and a condition is not met for each ID on the Order table.
The conditions I want are to return all rows where the ID's where order was packed, and has not shipped yet.
status = 'order_packed'
status <> 'order_shipped'
Order Table
ID | status |
---|---|
1 | order_received |
1 | order_processed |
1 | order_packed |
1 | order_shipped |
2 | order_received |
2 | order_processed |
2 | order_packed |
2 | order_canceled |
3 | order_received |
3 | order_processed |
3 | order_packed |
4 | order_received |
4 | order_processed |
4 | order_packed |
4 | order_hold |
4 | order_shipped |
5 | order_received |
5 | order_processed |
6 | order_received |
6 | order_processed |
6 | order_hold |
I am expecting the output to be
Output
ID | status |
---|---|
2 | order_received |
2 | order_processed |
2 | order_packed |
2 | order_canceled |
3 | order_received |
3 | order_processed |
3 | order_packed |
I have tried the below:
SELECT
id
,status
FROM order_table
WHERE
id IN (SELECT DISTINCT id FROM order_table WHERE status IN ('order_packed'))
AND id IN (SELECT DISTINCT id FROM order_table WHERE status <> 'order_shipped')
Any help is appreciated. Thanks
One of the options is to use a single subquery which will group the data by id and perform filtering via the having
clause:
SELECT
id
, status
FROM order_table
WHERE id IN (
select id
from order_table
group by id
having count_if(status = 'order_packed') > 0 -- exists packed
AND count_if(status = 'order_shipped') = 0); -- no shipped
Output:
id | status |
---|---|
2 | order_received |
2 | order_processed |
2 | order_packed |
2 | order_canceled |
3 | order_received |
3 | order_processed |
3 | order_packed |
Note that this highly likely will need to scan your data twice (use explain
to check the plan). If you are ok with grouping the results into array you can do something like the following:
SELECT
id
, statuses
FROM (select id, array_agg(status) statuses
from order_table
group by id)
where contains (statuses, 'order_packed') AND NOT contains(statuses, 'order_shipped')
Output:
id | statuses |
---|---|
3 | [order_received, order_processed, order_packed] |
2 | [order_received, order_processed, order_packed, order_canceled] |