postgresqlretentionchurn

postgres simple churn/retention


I'm trying to create a simple user retention, I tried a lot of ways searching, but none of them seems efficient and most of them have a lot of calculations that I don't really need.

Input my table example: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=ec68bebf63280023e828ebb6cabd2d89 OR:

DROP TABLE IF EXISTS users_churn;
create table 
users_churn(
id SERIAL PRIMARY KEY, 
users varchar(10) not null,
dates timestamp
);
insert into users_churn(users, dates)
values
('1ab7', '2022-01-04 10:22'),
('1ab7', '2022-04-26 18:30'),
('1ab7', '2022-05-12 20:10'),
('1ab7', '2022-07-02 20:55'),
('3ac5', '2022-02-05 05:12'),
('3ac5', '2022-04-09 07:17'),
('3ac5', '2022-07-03 04:19');
select * from users_churn

Output that I expect:

+-------------------------------------------------+
|                --DESIRED OUTPUT                 |
+-------------------------------------------------+
| ('1ab7', '2022-01-04 10:22', 'first_purchase'), |
| ('1ab7', '2022-02-01 00:00', 'churn'),          |
| ('1ab7', '2022-03-01 00:00', 'churn'),          |
| ('1ab7', '2022-04-26 18:30', 'retained'),       |
| ('1ab7', '2022-05-12 20:10', 'retained'),       |
| ('1ab7', '2022-06-01 00:00', 'churn'),          |
| ('1ab7', '2022-07-02 20:55', 'retained'),       |
| ('3ac5', '2022-02-05 05:12', 'first_purchase'), |
| ('3ac5', '2022-03-01 00:00', 'churn'),          |
| ('3ac5', '2022-04-09 07:17', 'retained'),       |
| ('3ac5', '2022-05-01 00:00', 'churn'),          |
| ('3ac5', '2022-06-01 00:00', 'churn'),          |
| ('3ac5', '2022-07-03 04:19', 'retained');       |
+-------------------------------------------------+

Solution

  • It's not clear what you want done if there is more that one purchase by a user in a month. This version keeps only the first one. If you want to keep all purchases, then get rid of the rn_month stuff:

    with churn as (
      select c.users, 
             c.dates, 
             date_trunc('month', c.dates) as mdate,
             row_number() over (partition by c.users 
                                    order by c.dates) as rn_all,
             row_number() over (partition by c.users, date_trunc('month', c.dates)
                                    order by c.dates) as rn_month
        from users_churn c
    ), months as (
      select generate_series(
               min(c.mdate),
               current_date,
               interval '1 month'
             )::timestamp as mdate
        from churn c
    )
    select c.users,
           coalesce(c1.dates, m.mdate) as dates,
           case 
             when c1.rn_all = 1 then 'first_purchase'
             when c1.users is null then 'churn'
             else 'retained'
           end as status
      from churn c
           join months m 
             on m.mdate >= c.mdate 
            and c.rn_all = 1
           left join churn c1 
             on c1.users = c.users
            and c1.mdate = m.mdate
            and c.rn_month = 1
     order by c.users, m.mdate;
    

    db<>fiddle here