I'm trying to create a binary flag variable where this new column indicates if an ID has an item, from a long list of items. The table is called items
and exists in Snowflake.
My table looks like
ID | Items |
------------
1 | A |
1 | B |
1 | C |
1 | X |
2 | N |
2 | M |
2 | P |
3 | R |
3 | T |
3 | X |
I want to create column that shows which ID's have item X. My expected output is
ID | HasX |
-----------
1 | 1 |
2 | 0 |
3 | 1 |
I've tried a couple different options, but when I do
SELECT
i.ID,
CASE
WHEN Items IN ('X')
THEN 1 OVER (PARTITION BY ID)
ELSE 0
END AS HasX
FROM items i
which gives me an error Invalid function '1'
.
This is probably simple, but I haven't found a solution anywhere yet. I suspect I need to use a window function like this, but I haven't figured out how to with CASE WHEN.
This is called conditional aggregation.
SELECT ID,
SUM(CASE WHEN Items = 'X' THEN 1 ELSE 0 END) AS HasX
FROM items
GROUP BY ID;
You could use MAX
function as well.
SELECT ID,
MAX(CASE WHEN Items = 'X' THEN 1 ELSE 0 END) AS HasX
FROM items
GROUP BY ID;
Another option would be using EXISTS
SELECT DISTINCT ID,
CASE WHEN EXISTS ( SELECT 1
FROM items i2
WHERE i2.ID = i1.ID AND i2.Items = 'X'
) THEN 1
ELSE 0
END AS HasX
FROM items i1;
See example