rdplyrtidyverserolling-computationwindowing

Count observations over rolling 30 day window


I need to create a variable that counts the number of observations that have occurred in the last 30 days for each id.

For example, imagine an observation that occurs on 1/2/2021 (d / m / y) for the id "a". If this observation is the first between 1/1/2021 and 1/2/2021 for the id "a" the variable must give 1. If it is the second, 2, etc.

Here is a larger example:

dat <- tibble::tribble(
  ~id,  ~q,   ~date,
  "a",   1,   "01/01/2021",
  "a",   1,   "01/01/2021",
  "a",   1,   "21/01/2021",
  "a",   1,   "21/01/2021",
  "a",   1,   "12/02/2021",
  "a",   1,   "12/02/2021",
  "a",   1,   "12/02/2021",
  "a",   1,   "12/02/2021",
  "b",   1,   "02/02/2021",
  "b",   1,   "02/02/2021",
  "b",   1,   "22/02/2021",
  "b",   1,   "22/02/2021",
  "b",   1,   "13/03/2021",
  "b",   1,   "13/03/2021",
  "b",   1,   "13/03/2021",
  "b",   1,   "13/03/2021")
dat$date <- lubridate::dmy(dat$date)

The result should be:

id  q   date    newvar
a   1   01/01/2021  1
a   1   01/01/2021  2
a   1   21/01/2021  3
a   1   21/01/2021  4
a   1   12/02/2021  3
a   1   12/02/2021  4
a   1   12/02/2021  5
a   1   12/02/2021  6
b   1   02/02/2021  1
b   1   02/02/2021  2
b   1   22/02/2021  3
b   1   22/02/2021  4
b   1   13/03/2021  3
b   1   13/03/2021  4
b   1   13/03/2021  5
b   1   13/03/2021  6

Thank you very much.


Solution

  • With sapply and between, count the number of observations prior to the current observation that are within 30 days.

    library(lubridate)
    library(dplyr)
    dat %>% 
      group_by(id) %>% 
      mutate(newvar = sapply(seq(length(date)), 
                             function(x) sum(between(date[1:x], date[x] - days(30), date[x]))))
    
    # A tibble: 16 x 4
    # Groups:   id [2]
       id        q date       newvar
       <chr> <dbl> <date>      <int>
     1 a         1 2021-01-01      1
     2 a         1 2021-01-01      2
     3 a         1 2021-01-21      3
     4 a         1 2021-01-21      4
     5 a         1 2021-02-12      3
     6 a         1 2021-02-12      4
     7 a         1 2021-02-12      5
     8 a         1 2021-02-12      6
     9 b         1 2021-02-02      1
    10 b         1 2021-02-02      2
    11 b         1 2021-02-22      3
    12 b         1 2021-02-22      4
    13 b         1 2021-03-13      3
    14 b         1 2021-03-13      4
    15 b         1 2021-03-13      5
    16 b         1 2021-03-13      6