I have data of repeated measurements (4 or 5 times) of 3 patients in a long format:
library(dplyr)
library(magrittr)
questiondata <- structure(list(ID = c(2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4,
4), time = c("time1", "time2", "time3", "time4", "time1", "time2",
"time3", "time4", "time5", "time1", "time2", "time3", "time4",
"time5"), drug_use = structure(c(NA, 1L, NA, NA, NA, 2L, NA,
NA, NA, NA, 1L, NA, NA, NA), .Label = c("no", "yes"), class = "factor")), row.names = c(NA,
-14L), class = c("tbl_df", "tbl", "data.frame"))
# Corresponding to the following tibble:
# A tibble: 14 x 3
ID time drug_use
<dbl> <chr> <fct>
1 2 time1 NA
2 2 time2 no
3 2 time3 NA
4 2 time4 NA
5 3 time1 NA
6 3 time2 yes
7 3 time3 NA
8 3 time4 NA
9 3 time5 NA
10 4 time1 NA
11 4 time2 no
12 4 time3 NA
13 4 time4 NA
14 4 time5 NA
I am trying to make a new column on drug use (named 'drug_use_2
) that says 'yes' or 'no' per patient, based on if that patient used drugs at any time
. So the desired result would look like this:
A tibble: 14 x 4
ID time drug_use drug_use2
<dbl> <chr> <fct> <fct>
1 2 time1 NA no
2 2 time2 no no
3 2 time3 NA no
4 2 time4 NA no
5 3 time1 NA yes
6 3 time2 yes yes
7 3 time3 NA yes
8 3 time4 NA yes
9 3 time5 NA yes
10 4 time1 NA no
11 4 time2 no no
12 4 time3 NA no
13 4 time4 NA no
14 4 time5 NA no
I have tried grouping on PXE number and/or on drug_use and then using mutate
with case when
, but that gets stuck on the missing values for drug_use:
questiondata <- questiondata %>%
group_by(ID) %>%
mutate(drug_use2=
case_when(
drug_use=="yes" ~ "yes",
drug_use=="no" ~ "no",
TRUE ~ "missing"))
# A tibble: 14 x 4
# Groups: ID [3]
ID time drug_use drug_use2
<dbl> <chr> <fct> <chr>
1 2 time1 NA missing
2 2 time2 no no
3 2 time3 NA missing
4 2 time4 NA missing
5 3 time1 NA missing
6 3 time2 yes yes
7 3 time3 NA missing
8 3 time4 NA missing
9 3 time5 NA missing
10 4 time1 NA missing
11 4 time2 no no
12 4 time3 NA missing
13 4 time4 NA missing
14 4 time5 NA missing
Am I using the group_by wrong? Thanks in advance
Use any
questiondata %>%
group_by(ID) %>%
mutate(druguse2 = case_when(any(drug_use == 'yes') ~ 'yes',
TRUE ~ 'no'))
# A tibble: 14 x 4
# Groups: ID [3]
ID time drug_use druguse2
<dbl> <chr> <fct> <chr>
1 2 time1 NA no
2 2 time2 no no
3 2 time3 NA no
4 2 time4 NA no
5 3 time1 NA yes
6 3 time2 yes yes
7 3 time3 NA yes
8 3 time4 NA yes
9 3 time5 NA yes
10 4 time1 NA no
11 4 time2 no no
12 4 time3 NA no
13 4 time4 NA no
14 4 time5 NA no