rdatetimelubridateperiod

How do I split time periods by 5 minute breaks (00:00, 00:05, 00:10, ...)?


I have a a number of periods which I would like to split by 5 minute intervals (00:00, 00:05, 00:10, 00:15, ...), if they span across such 5 minute intervals.

How do I turn

periods_raw <- read.table(header=TRUE, text="
COLOR           UTC_DateTime_Start      UTC_DateTime_End
Green           2022-06-15 14:56:23     2022-06-15 15:03:11
Purple          2022-06-15 15:03:11     2022-06-15 15:14:48
Red             2022-06-15 16:27:11     2022-06-15 16:29:48
")

into

periods_split <- read.table(header=TRUE, text="
COLOR           UTC_DateTime_Start      UTC_DateTime_End
Green           2022-06-15 14:56:23     2022-06-15 15:00:00
Green           2022-06-15 15:00:00     2022-06-15 15:03:11
Purple          2022-06-15 15:03:11     2022-06-15 15:05:00
Purple          2022-06-15 15:05:00     2022-06-15 15:10:00
Purple          2022-06-15 15:10:00     2022-06-15 15:14:48
Red             2022-06-15 16:27:11     2022-06-15 16:29:48
")

Solution

  • If the 'UTC_DateTime' columns are not Datetime class (POSIXct), convert (ymd_hms from lubridate), then loop over the columns 'start', 'end' (suffix), create a sequence by '5 minute', use floor_date for the elements that are not the first or the last, create a tibble by removing the last and first observation from the sequence to create the new 'Start', 'End' columns and unnest the list column

    library(dplyr)
    library(tidyr)
    library(lubridate)
    library(purrr)
    periods_raw %>% 
      mutate(across(starts_with("UTC_DateTime"), ymd_hms)) %>% 
      mutate(new = map2(UTC_DateTime_Start, UTC_DateTime_End, ~ {
         v1 <- c(seq(.x, .y, by = "5 min"), .y)
         v1[-c(1, length(v1))] <- floor_date(v1[-c(1, length(v1))], "5 min")
       tibble(UTC_DateTime_Start = v1[-length(v1)], UTC_DateTime_End = v1[-1]) 
        }), .keep = "unused") %>% 
      unnest(new)
    

    -output

    # A tibble: 6 × 3
      COLOR  UTC_DateTime_Start  UTC_DateTime_End   
      <chr>  <dttm>              <dttm>             
    1 Green  2022-06-15 14:56:23 2022-06-15 15:00:00
    2 Green  2022-06-15 15:00:00 2022-06-15 15:03:11
    3 Purple 2022-06-15 15:03:11 2022-06-15 15:05:00
    4 Purple 2022-06-15 15:05:00 2022-06-15 15:10:00
    5 Purple 2022-06-15 15:10:00 2022-06-15 15:14:48
    6 Red    2022-06-15 16:27:11 2022-06-15 16:29:48
    

    data

    periods_raw <- structure(list(COLOR = c("Green", "Purple", "Red"), UTC_DateTime_Start = c("2022-06-15 14:56:23", 
    "2022-06-15 15:03:11", "2022-06-15 16:27:11"), UTC_DateTime_End = c("2022-06-15 15:03:11", 
    "2022-06-15 15:14:48", "2022-06-15 16:29:48")), class = "data.frame", row.names = c(NA, 
    -3L))