mysqldatepartitioningrolling-computationlevel-of-detail

Comparing Dates by Fixed Level of Detail in mySQL


Good Morning,

I'm trying to create a mySQL query with the following output:

Click here to view desired table output

In order to get the 60_days_since_seen, I need to compare the reference_date for the current row to the next most recent reference_date relative to the unique_identifier and determine if that reference_date is older than 60 days from the reference_date from that row.

Ex: unique_identifier = 123456789

id = 1: Since there is no date before this instance, the 60_days_since_seen is TRUE (or 1).

id = 2: Since the date of the reference_date occurrence prior to this id relative to the unique_identifier is less than 60 days since this row's reference_date, the 60_days_since_seen is FALSE (or 0).

id = 3: Since the date of the reference_date occurrence prior to this id relative to the unique_identifier is greater than 60 days since this row's reference_date, the 60_days_since_seen is TRUE (or 1).

id = 4: Since there is no date before this instance, the 60_days_since_seen is TRUE (or 1).

id = 5: Since the date of the reference_date occurrence prior to this id relative to the unique_identifier is less than 60 days since this row's reference_date, the 60_days_since_seen is FALSE (or 0).


Solution

  • The first query works also with mysql 5.x.

    The second uses window functions that are only supported in mysql 8

    CREATE TABLE datetable (
      `Id` INTEGER,
      `unique_identifier` INTEGER,
      `reference_date` Date
    );
    
    INSERT INTO datetable
      (`Id`, `unique_identifier`, `reference_date`)
    VALUES
      ('1', '123456789', '2020-01-05'),
      ('2', '123456789', '2020-02-20'),
      ('3', '123456789', '2020-06-29'),
      ('4', '987654321', '2020-02-21'),
      ('5', '987654321', '2020-03-30');
    
    SELECT
    `Id`, `reference_date`
    ,IF(@unique_identifier = `unique_identifier`,IF(DATEDIFF(`reference_date`,@date) > 60,1,0),1) 60_days_since_seen
    ,@unique_identifier := `unique_identifier` as 'unique_identifier'
    ,@date :=`reference_date` 'reference_date'
    FROM
    (SELECT * FROM datetable ORDER BY unique_identifier,reference_date) t1
    ,(SELECT @unique_identifier := 0) a,(SELECT @date := NOW()) b
    
    Id | reference_date | 60_days_since_seen | unique_identifier | reference_date
    -: | :------------- | -----------------: | ----------------: | :-------------
     1 | 2020-01-05     |                  1 |         123456789 | 2020-01-05    
     2 | 2020-02-20     |                  0 |         123456789 | 2020-02-20    
     3 | 2020-06-29     |                  1 |         123456789 | 2020-06-29    
     4 | 2020-02-21     |                  1 |         987654321 | 2020-02-21    
     5 | 2020-03-30     |                  0 |         987654321 | 2020-03-30    
    
    SELECT
    `Id`, `unique_identifier`, `reference_date`,
       IF(DATEDIFF(reference_date , LAG(reference_date) OVER (
            PARTITION BY unique_identifier
            ORDER BY reference_date ))> 60 OR LAG(reference_date) OVER (
            PARTITION BY unique_identifier
            ORDER BY reference_date ) is NULL,1,0) 60_days_since_seen
    FROM datetable
    
    Id | unique_identifier | reference_date | 60_days_since_seen
    -: | ----------------: | :------------- | -----------------:
     1 |         123456789 | 2020-01-05     |                  1
     2 |         123456789 | 2020-02-20     |                  0
     3 |         123456789 | 2020-06-29     |                  1
     4 |         987654321 | 2020-02-21     |                  1
     5 |         987654321 | 2020-03-30     |                  0
    

    db<>fiddle here