rdplyrmutaterep

dplyr mutate sequence by n levels of grouping by variable


So, my data looks like the following:

Period Date
1 01-01-2020
1 02-01-2020
1 03-01-2020
2 04-01-2020
2 05-01-2020
2 06-01-2020
3 07-01-2020
3 08-01-2020
3 09-01-2020
4 10-01-2020
4 11-01-2020
4 12-01-2020

The period variable is always a sequence of 3 numbers. I'm trying to make a new column that takes the last Date value from n periods and repeats it, i.e. imagine that I want do build this variable accounting 2 periods:

Period New_var
1 06-01-2020
1 06-01-2020
1 06-01-2020
2 06-01-2020
2 06-01-2020
2 06-01-2020
3 12-01-2020
3 12-01-2020
3 12-01-2020
4 12-01-2020
4 12-01-2020
4 12-01-2020

This variable would take the last Date observation and repeated across 2 periods in this case. Thank you

I've tried grouping by Period and the mutate the variable as:

rep(last(Date),each=3*2, length.out = n())

This doesn't work, tho.


Solution

  • Here is a full solution using a function and with the data you provided:

    
    library(tidyverse)
    
    # importing the data you provided
    date_df <- read_table(
      "Period   Date
    1   01-01-2020
    1   02-01-2020
    1   03-01-2020
    2   04-01-2020
    2   05-01-2020
    2   06-01-2020
    3   07-01-2020
    3   08-01-2020
    3   09-01-2020
    4   10-01-2020
    4   11-01-2020
    4   12-01-2020"
    ) %>%
      mutate(Date = as.Date(Date, format = "%d-%m-%Y")) # important that Date variable has the Date class
    
    # function so that n can be dynamic
    group_dates <- function(date_df, n) {
      
      date_df %>%
        mutate(Period_cut = cut_interval(Period, length = n)) %>%
        group_by(Period_cut) %>%
        mutate(New_var = max(Date)) %>%
        ungroup() %>%
        select(-Date, -Period_cut)
      
    }
    
    # run with n = 2
    group_dates(date_df, 2)