I'm trying to select the count of users retaining in a certain date range and that was successful using this query and the table below:
+----------+-------------+
| Field | Type |
+----------+-------------+
| id | varchar(17) |
| log_date | date |
+----------+-------------+
SELECT last_day.log_date, COUNT(distinct last_day.id) as users_num
FROM (SELECT DISTINCT log_date, id
FROM `userActivity`) this_day
JOIN (SELECT DISTINCT log_date, id
FROM `userActivity`) last_day
ON this_day.id = last_day.id
AND this_day.log_date = "2018-10-01"
AND last_day.log_date BETWEEN "2018-10-01" AND "2018-10-30"
GROUP BY log_date;
But the problem that I'm facing is that I want the assume that every day of the date-range is day 0. (similar to the following example):
Note that the first row in the pic is the avg of the below results I need to calculate. Anyone has any idea how can I enhance my query to get the result like the picture?
This solution will only work on MySQL 8.x only since it requires CTEs (Common Table Expressions):
with digits as (
select 0 as n union select 1 union select 2 union select 3 union select 4
union select 5 union select 6 union select 7 union select 8 union select 9
),
series as (
select d1.n * 100 + d2.n * 10 + d3.n as n -- number series from 0 to 999
from digits d1
cross join digits d2
cross join digits d3
)
SELECT last_day.log_date, COUNT(distinct last_day.id) as users_num,
date_add("2018-10-01", interval s.n day) as current_start
FROM (SELECT DISTINCT log_date, id
FROM `userActivity`) this_day
JOIN (SELECT DISTINCT log_date, id
FROM `userActivity`) last_day
ON this_day.id = last_day.id
cross join series s
WHERE s.n <= 30
AND this_day.log_date = date_add("2018-10-01", interval s.n day)
AND last_day.log_date BETWEEN date_add("2018-10-01", interval s.n day)
AND date_add("2018-10-30", interval s.n day)
GROUP BY log_date, date_add("2018-10-01", interval s.n day);