I have dataframe like the following:
> head(n)
# A tibble: 6 x 23
`Record ID of REGN DATA` Pain_1 Pain_2 Redness_1 Redness_2 Swelling_1 Swelling_2
<dbl> <chr> <chr> <chr> <chr> <chr> <chr>
1 1 Yes Yes No No No No
2 2 No Yes No No No No
3 3 Yes No No No No No
4 4 Yes Yes No No Yes Yes
5 5 No No No No No No
6 6 No No No No No No
Pain_1
and Pain_2
are the answers for the symptom "pain" of the patient during their first visit and second visit, respectively. The same is true for the variables Redness
and Swelling
. I want to convert the dataframe as longitudinal for each of the symptoms Pain
, Redness
and Swelling
and a flag variable to show the visit number like the following. I tried with gather
function which converts all symptoms in one column. Can someone kindly help
> head(tr)
# A tibble: 6 x 5
`Record ID of REGN DATA` Pain Redness Swelling Visit
<dbl> <chr> <chr> <chr> <dbl>
1 1 Yes No No 1
2 2 No No No 1
3 3 Yes No No 1
4 1 Yes No No 2
5 2 Yes No No 2
6 3 No No No 2
This is the sample data
structure(list(`Record ID of REGN DATA` = c(1, 2, 3, 4, 5, 6,
7, 8, 9, 10), Pain_1 = c("Yes", "No", "Yes", "Yes", "No", "No",
"Yes", "Yes", "Yes", "Yes"), Pain_2 = c("Yes", "Yes", "No", "Yes",
"No", "No", "No", "Yes", "Yes", "Yes"), Redness_1 = c("No", "No",
"No", "No", "No", "No", "Yes", "Yes", "No", "No"), Redness_2 = c("No",
"No", "No", "No", "No", "No", "No", "Yes", "No", "No"), Swelling_1 = c("No",
"No", "No", "Yes", "No", "No", "No", "Yes", "No", "Yes"), Swelling_2 = c("No",
"No", "No", "Yes", "No", "No", "No", "Yes", "No", "Yes"), Tiredness_1 = c("Yes",
"No", "Yes", "Yes", "No", "No", "No", "No", "No", "Yes"), Tiredness_2 = c("Yes",
"Yes", "No", "Yes", "No", "No", "Yes", "No", "Yes", "Yes"), Headache_1 = c("No",
"No", "No", "Yes", "No", "No", "No", "No", "No", "No"), Headache_2 = c("No",
"Yes", "No", "Yes", "No", "No", "No", "Yes", "No", "No"), Muscle_1 = c("Yes",
"No", "Yes", "Yes", "No", "No", "No", "No", "No", "No"), Muscle_2 = c("Yes",
"Yes", "No", "No", "No", "No", "Yes", "Yes", "No", "No"), Chills_1 = c("No",
"No", "Yes", "No", "No", "No", "No", "No", "No", "No"), Chills_2 = c("No",
"Yes", "No", "No", "No", "No", "Yes", "No", "No", "No"), Fever_1 = c("Yes",
"No", "No", "No", "No", "No", "No", "No", "Yes", "Yes"), Fever_2 = c("Yes",
"Yes", "No", "No", "No", "No", "Yes", "No", "No", "No"), Nausea_1 = c("No",
"No", "No", "No", "No", "No", "No", "No", "No", "No"), Nausea_2 = c("No",
"No", "No", "No", "No", "No", "No", "Yes", "No", "No"), JointPain_1 = c("Yes",
"No", "Yes", "No", "No", "No", "No", "No", "Yes", "No"), JointPain_2 = c("Yes",
"No", "No", "No", "No", "No", "Yes", "No", "No", "No"), `Allergic reaction_1` = c("No",
"No", "No", "No", "No", "No", "No", "No", "No", "No"), `Allergic reaction_2` = c("No",
"No", "No", "No", "No", "No", "No", "No", "No", "No")), row.names = c(NA,
-10L), class = c("tbl_df", "tbl", "data.frame"))
A more direct way using the names_sep
argument of pivot_longer
:
library(tidyr)
df %>%
pivot_longer(
cols = -1,
names_to = c(".value", "visit"),
names_sep = "_"
)
`Record ID of REG~ visit Pain Redness Swelling Tiredness Headache Muscle Chills Fever Nausea JointPain `Allergic react~
<dbl> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 1 1 Yes No No Yes No Yes No Yes No Yes No
2 1 2 Yes No No Yes No Yes No Yes No Yes No
3 2 1 No No No No No No No No No No No
4 2 2 Yes No No Yes Yes Yes Yes Yes No No No
5 3 1 Yes No No Yes No Yes Yes No No Yes No
6 3 2 No No No No No No No No No No No
7 4 1 Yes No Yes Yes Yes Yes No No No No No
8 4 2 Yes No Yes Yes Yes No No No No No No
9 5 1 No No No No No No No No No No No
10 5 2 No No No No No No No No No No No
11 6 1 No No No No No No No No No No No
12 6 2 No No No No No No No No No No No
13 7 1 Yes Yes No No No No No No No No No
14 7 2 No No No Yes No Yes Yes Yes No Yes No
15 8 1 Yes Yes Yes No No No No No No No No
16 8 2 Yes Yes Yes No Yes Yes No No Yes No No
17 9 1 Yes No No No No No No Yes No Yes No
18 9 2 Yes No No Yes No No No No No No No
19 10 1 Yes No Yes Yes No No No Yes No No No
20 10 2 Yes No Yes Yes No No No No No No No