So, my data looks like the following:
Period | Date |
---|---|
1 | 01-01-2020 |
1 | 02-01-2020 |
1 | 03-01-2020 |
2 | 04-01-2020 |
2 | 05-01-2020 |
2 | 06-01-2020 |
3 | 07-01-2020 |
3 | 08-01-2020 |
3 | 09-01-2020 |
4 | 10-01-2020 |
4 | 11-01-2020 |
4 | 12-01-2020 |
The period variable is always a sequence of 3 numbers. I'm trying to make a new column that takes the last Date value from n periods and repeats it, i.e. imagine that I want do build this variable accounting 2 periods:
Period | New_var |
---|---|
1 | 06-01-2020 |
1 | 06-01-2020 |
1 | 06-01-2020 |
2 | 06-01-2020 |
2 | 06-01-2020 |
2 | 06-01-2020 |
3 | 12-01-2020 |
3 | 12-01-2020 |
3 | 12-01-2020 |
4 | 12-01-2020 |
4 | 12-01-2020 |
4 | 12-01-2020 |
This variable would take the last Date observation and repeated across 2 periods in this case. Thank you
I've tried grouping by Period and the mutate the variable as:
rep(last(Date),each=3*2, length.out = n())
This doesn't work, tho.
Here is a full solution using a function and with the data you provided:
library(tidyverse)
# importing the data you provided
date_df <- read_table(
"Period Date
1 01-01-2020
1 02-01-2020
1 03-01-2020
2 04-01-2020
2 05-01-2020
2 06-01-2020
3 07-01-2020
3 08-01-2020
3 09-01-2020
4 10-01-2020
4 11-01-2020
4 12-01-2020"
) %>%
mutate(Date = as.Date(Date, format = "%d-%m-%Y")) # important that Date variable has the Date class
# function so that n can be dynamic
group_dates <- function(date_df, n) {
date_df %>%
mutate(Period_cut = cut_interval(Period, length = n)) %>%
group_by(Period_cut) %>%
mutate(New_var = max(Date)) %>%
ungroup() %>%
select(-Date, -Period_cut)
}
# run with n = 2
group_dates(date_df, 2)