rdatetime

More efficient way to compare if a DateTime is in between any of two columns of DateTimes in R?


I have a ~38,000 observation dataframe (hr_clean) that has a POSIXct value for each row, this is fitbit data that tracks average Heart Rate over every minute for the duration of a month.

I want to check if the Date and Time of each row of hr_clean is during sleeping or waking hours. I have a dataframe of sleep ranges that I will use to make this comparison (sleep_ranges).

For each row I need to check if the Date and Time is in between the sleep time and wake time (POSIXct objects), to determine if the subject was asleep during that minute.

I wrote a for loop that effectively does this; however it is incredibly slow when running a comparison of 38,000 by 26 sleep ranges. Please help me make this code efficient.

For Loop:

# Set seed for reproducibility
set.seed(123)

# Define the start and end dates
start_date <- ymd_hms("2025-03-01 00:00:00")
end_date <- ymd_hms("2025-04-01 23:59:59")

hr_clean <- data.frame(dateTime = start_date + runif(10, 0, as.numeric(difftime(end_date, start_date, units = "secs"))))


sleep_ranges <- data.frame(startTime = ymd_hms(c("2025-03-15 00:58:00",
                                                 "2025-03-14 00:17:00",
                                                 "2025-03-12 23:38:00",
                                                 "2025-03-12 00:21:30")),
                           endTime = ymd_hms(c("2025-03-15 09:10:30",
                                               "2025-03-14 07:30:30",
                                               "2025-03-13 07:45:30",
                                               "2025-03-12 07:26:30")))


is_sleeping_df <- data.frame(dateTime = character(),
                             is_sleeping = character())

for(minute in hr_clean$dateTime){
  print(as_datetime(minute))
  sleepTF = (any(minute > sleep_ranges$startTime & minute < sleep_ranges$endTime))
  
  df <- data.frame(dateTime = as_datetime(minute),
                   is_sleeping = sleepTF)
  
  is_sleeping_df <- rbind(is_sleeping_df, df)
  
  
}

Solution

  • You might use | in Reduce with data.table.

    > library(data.table)
    > setDT(hr_clean); setDT(sleep_ranges); 
    > 
    > hr_clean[, asleep := Reduce(`|`, lapply(seq_len(nrow(sleep_ranges)), \(i)
    +                                         dateTime %between% sleep_ranges[i]))][]
                   dateTime asleep
                     <POSc> <lgcl>
     1: 2025-03-10 04:34:18  FALSE
     2: 2025-03-26 04:37:47  FALSE
     3: 2025-03-14 01:41:06   TRUE
     4: 2025-03-29 05:16:26  FALSE
     5: 2025-03-31 02:20:17  FALSE
     6: 2025-03-02 10:56:30  FALSE
     7: 2025-03-17 21:03:24  FALSE
     8: 2025-03-29 12:29:06  FALSE
     9: 2025-03-18 14:57:01  FALSE
    10: 2025-03-15 14:13:24  FALSE
    

    Same logic in base R, which is slightly slower (~1.5 times).

    > hr_clean |> 
    +   transform(
    +     asleep=Reduce(`|`, lapply(seq_len(nrow(sleep_ranges)), \(i)
    +                               dateTime > sleep_ranges[i, 1] & 
    +                                 dateTime < sleep_ranges[i, 2]))
    +   )
                  dateTime asleep
    1  2025-03-10 04:34:18  FALSE
    2  2025-03-26 04:37:47  FALSE
    3  2025-03-14 01:41:06   TRUE
    4  2025-03-29 05:16:26  FALSE
    5  2025-03-31 02:20:17  FALSE
    6  2025-03-02 10:56:30  FALSE
    7  2025-03-17 21:03:24  FALSE
    8  2025-03-29 12:29:06  FALSE
    9  2025-03-18 14:57:01  FALSE
    10 2025-03-15 14:13:24  FALSE
    

    Alternatively outer in base R, but it's much slower.

    > hr_clean |> 
    +   transform(
    +   asleep=rowSums(
    +     outer(hr_clean$dateTime, 
    +           asplit(sleep_ranges, 1), 
    +           Vectorize(\(x, y) x > y[1] & x < y[2]))
    +     ) > 0
    + )
                   dateTime asleep
                     <POSc> <lgcl>
     1: 2025-03-10 04:34:18  FALSE
     2: 2025-03-26 04:37:47  FALSE
     3: 2025-03-14 01:41:06   TRUE
     4: 2025-03-29 05:16:26  FALSE
     5: 2025-03-31 02:20:17  FALSE
     6: 2025-03-02 10:56:30  FALSE
     7: 2025-03-17 21:03:24  FALSE
     8: 2025-03-29 12:29:06  FALSE
     9: 2025-03-18 14:57:01  FALSE
    10: 2025-03-15 14:13:24  FALSE