I am trying to fill the missing values of a variable with the value that belongs to the same ID. For some reason my code does not work: there is no error message, but nothing gets filled.
Could anyone tell me why? Did I mess anything up?
Here is my sample data and code:
df <- structure(list(SUBJECT = structure(c("101", "101",
"101", "401", "101", "401"), label = "Subject name or identifier",
format.sas = "$"), PEHEIGHT = structure(c("162.6", "", "", "176", "", ""),
label = "PEHEIGHT (Character)", format.sas = "$"),
PEDT = structure(c("27 DEC 2023", "10 JAN 2024", "24 JAN 2024",
"23 JAN 2024", "07 FEB 2024", "8 FEB 2024"), label = "PEDT (Character)",
format.sas = "$")), row.names = c(NA, -6L),
class = c("tbl_df", "tbl", "data.frame"))
df %>%
  dplyr::group_by(SUBJECT) %>%
  tidyr::fill(PEHEIGHT, .direction = "downup")
The issue is that the missing values in PEHEIGHT are empty strings ("") rather than NA, and tidyr::fill() only fills NA values. Run
df$PEHEIGHT[df$PEHEIGHT == ""] <- NA
before the fill step and it works as expected. To do this for the whole data frame, you could use df[df == ""] <- NA instead.
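If you prefer to stay inside a dplyr pipeline, something along these lines should also work for the whole-data-frame case. This is only a sketch: `toy` and its columns are made-up stand-ins for your data, and it assumes a dplyr version that provides `across()` (1.0.0 or later).

```r
library(dplyr)

# Hypothetical toy tibble (not the asker's df); "" stands in for a missing value
toy <- tibble(
  id     = c("a", "b"),
  height = c("", "175"),
  date   = c("01 JAN 2024", "")
)

# Replace "" with NA in every character column
cleaned <- toy %>%
  mutate(across(where(is.character), ~ na_if(.x, "")))
```

Here `na_if(.x, "")` turns each empty string into NA and leaves every other value untouched, so non-blank entries such as "175" are unchanged.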
library(tidyverse)
df <- structure(list(SUBJECT = structure(c("101", "101", "101", "401", "101", "401"),
label = "Subject name or identifier", format.sas = "$"),
PEHEIGHT = structure(c("162.6", "", "", "176", "", ""),
label = "PEHEIGHT (Character)", format.sas = "$"),
PEDT = structure(c("27 DEC 2023", "10 JAN 2024", "24 JAN 2024",
"23 JAN 2024", "07 FEB 2024", "8 FEB 2024"),
label = "PEDT (Character)", format.sas = "$")),
row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
df$PEHEIGHT[df$PEHEIGHT == ""] <- NA
df %>%
  dplyr::group_by(SUBJECT) %>%
  tidyr::fill(PEHEIGHT, .direction = "downup")
# A tibble: 6 × 3
# Groups:   SUBJECT [2]
  SUBJECT PEHEIGHT PEDT
  <chr>   <chr>    <chr>
1 101     162.6    27 DEC 2023
2 101     162.6    10 JAN 2024
3 101     162.6    24 JAN 2024
4 401     176      23 JAN 2024
5 101     162.6    07 FEB 2024
6 401     176      8 FEB 2024
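To see the difference side by side, here is a minimal sketch on made-up data (`toy`, `id` and `height` are hypothetical names, not part of the original df). It runs the same grouped fill twice, once with the blanks left as "" and once after converting them to NA.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data; "" marks a blank height
toy <- tibble(
  id     = c("a", "a", "b", "b"),
  height = c("160", "", "", "175")
)

# fill() only replaces NA, so the empty strings are left as they are
unfilled <- toy %>%
  group_by(id) %>%
  fill(height, .direction = "downup") %>%
  ungroup()

# Convert the blanks to NA first, then fill within each id
filled <- toy %>%
  mutate(height = na_if(height, "")) %>%
  group_by(id) %>%
  fill(height, .direction = "downup") %>%
  ungroup()
```

In `unfilled` the height column still contains the two empty strings, while in `filled` each id carries its single known height on every row. The "downup" direction matters for id "b", whose only known value comes after the blank.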