I need to calculate the distinct names an account had, where I have the date of each update the account had:
Date | ACCOUNT | NAME |
---|---|---|
2024-01-01 | A | XOXO |
2024-01-02 | A | XOXO |
2024-01-02 | A | OXXO |
2024-01-04 | A | XOOX |
2024-01-05 | A | OOXO |
2024-01-06 | A | XOXO |
So for the date 2024-01-01 prior to that date, I don't have any record with account A with another NAME but for the records on day 2024-01-02 for both records I already have a unique record being XOXO, so on the third day, I only have 2 distinct NAMES prior that date.
So the result of my query has to be something like this:
DATE | ACCOUNT | COUNT DISTINCT NAME |
---|---|---|
2024-01-01 | A | NULL |
2024-01-02 | A | 1 |
2024-01-04 | A | 2 |
2024-01-05 | A | 3 |
2024-01-06 | A | 4 |
I've already tried PARTITION BY ACCOUNT ORDER BY NAME
.
And this condition counts every name as a unique name, even if it's repeated like XOXO
DENSE_RANK() OVER (Partition by ACCOUNT Order by NAME ASC) +
DENSE_RANK() OVER (Partition by ACCOUNT Order by NAME DESC) - 1 as COUNTP
This condition always returned the total distinct Names per record, even for the date 2024-02-02 where I was expecting just one record counted.
Regards!
You can do it without a correlated sub-query by using the ROW_NUMBER
analytic function to number the occurrences of each name/account pairing and then, using an inline-view, to total the first appearances of each pairing:
SELECT DISTINCT
dt,
account,
SUM(CASE rn WHEN 1 THEN 1 END) OVER (
PARTITION BY account
ORDER BY dt
RANGE BETWEEN UNBOUNDED PRECEDING AND INTERVAL '1' DAY PRECEDING
) AS count_distinct_name
FROM (
SELECT dt,
account,
ROW_NUMBER() OVER (PARTITION BY account, name ORDER BY dt) AS rn
FROM table_name
)
ORDER BY account, dt;
Which, for the sample data:
CREATE TABLE table_name (Dt, ACCOUNT, NAME) AS
SELECT DATE '2024-01-01', 'A', 'XOXO' FROM DUAL UNION ALL
SELECT DATE '2024-01-02', 'A', 'XOXO' FROM DUAL UNION ALL
SELECT DATE '2024-01-02', 'A', 'OXXO' FROM DUAL UNION ALL
SELECT DATE '2024-01-04', 'A', 'XOOX' FROM DUAL UNION ALL
SELECT DATE '2024-01-05', 'A', 'OOXO' FROM DUAL UNION ALL
SELECT DATE '2024-01-06', 'A', 'XOXO' FROM DUAL;;
Outputs:
DT | ACCOUNT | COUNT_DISTINCT_NAME |
---|---|---|
2024-01-01 00:00:00 | A | null |
2024-01-02 00:00:00 | A | 1 |
2024-01-04 00:00:00 | A | 2 |
2024-01-05 00:00:00 | A | 3 |
2024-01-06 00:00:00 | A | 4 |