How can I create a new column in SQL (on Snowflake) where it counts the previous occurrences of an ID value for a time series. Should go something like this:
Input Data
| Date | ID |
| -------- | -------- |
| 1/1/2025 | a1 |
| 1/2/2025 | a2 |
| 1/3/2025 | a2 |
| 1/4/2025 | a1 |
| 1/5/2025 | a3 |
| 1/6/2025 | a1 |
Output Data
| Date | ID | Count |
| -------- | -------- | -------- |
| 1/1/2025 | a1 | 0 |
| 1/2/2025 | a2 | 0 |
| 1/3/2025 | a2 | 1 |
| 1/4/2025 | a1 | 1 |
| 1/5/2025 | a3 | 0 |
| 1/6/2025 | a1 | 2 |
Thanks!
All other answers I could find searching for this only counted if the ID val appeared in the the immediately previous row. I want to look back for the entire time series. It is a large dataset too so don't want to break the server.
Using cumulative COUNT:
SELECT *, COUNT(*) OVER(PARTITION BY ID ORDER BY Date) - 1 AS cnt
FROM tab;