I need to populate my data with new rows based on date. I discovered "complete seq.Date" from tidyr but I am getting a error. Below is a sample of my data.
Hospital Room dt
1 1 100 2023-05-24 10:08:38
2 1 100 2023-05-24 10:09:36
3 1 200 <NA>
4 1 300 2023-05-24 13:51:10
5 2 145 2023-05-24 18:51:10
6 2 145 2023-05-24 20:51:10
7 2 100 <NA>
d1 <- data.frame(Hospital = c(1, 1, 1, 1, 2, 2,2),
Room = c(100, 100, 200, 300, 145, 145, 100),
dt = as.POSIXct(c('2023-04-24 10:08:38',
'2023-05-10 10:09:36', '', '2023-05-10 13:51:10',
'2023-04-20 18:51:10', '2023-04-24 20:51:10',''),
format="%Y-%M-%d %H:%M:%S"))
The goal is that every hospital room have date values from 04/24/2023 to 05/10/2023 (min and max dates). Below is my attempt.
d2 <- d1 %>%
mutate(Date2 = as.Date(dt)) %>%
complete(Date2 = seq.Date(min(Date2, na.rm=T), max(Date2, na.rm=T), by="day"), Hospital, Room)
I added na.rm=T because I was getting a error message "'to must be a finite number". Although I get results, they are wrong. The combinations of date, hospital a room are incorrect. I believe this has to do because I have two "by" fields: hospital and room.
Thanks!
I was able to figure this out. First I created a date and ID variable (using Hospital and Room). See the code below.
library(dplyr)
d1 <- d1 %>%
mutate(ID = paste(Hospital, Room, sep = "-"),
dt2 = as.Date(dt)) %>%
complete(dt2 = seq.Date(min(dt2, na.rm=T), max(dt2, na.rm=T), by="day"), ID) %>%
arrange(ID, dt2) %>%
select(ID, Hospital, Room, dt2)
What I need now is to fill the NAs with their corresponding Hospital and Room value. I tried the fill function but it didn't work. Any ideas? Thanks! ex.
ID Hospital Room dt2
1 1-100 1 100 2023-05-10
2 1-100 NA NA 2023-05-11
3 1-100 NA NA 2023-05-12
4 1-100 NA NA 2023-05-13
5 1-100 NA NA 2023-05-14
6 1-100 NA NA 2023-05-15