rlongitudinal

Turn wide format into longitudinal data format


I have a dataset in wide format with no time variable and I would like to create a time variable and turn it into long format for longitudinal analysis. The original dataset looks like this:

id <- c(1,2,3)
pdq1 <- c(3,5,6)
pdq2 <- c(1,3,4)
pdq3 <- c(4,5,4)
scor_abp1 <- c(1,2,3)
scor_abp2 <- c(2,2,4)
scor_abp3 <- c(1,4,5)
dat <- data.frame(id,pdq1,pdq2,pdq3,scor_abp1,scor_abp2,scor_abp3)

Desired output:

id <- c(1,1,1,2,2,2,3,3,3)
time <- c(1,2,3,1,2,3,1,2,3)
pdq <- c(3,1,4,5,3,5,6,4,4)
abp <- c(1,2,1,2,2,4,3,4,5)
dat <- data.frame(id,time,pdq,abp)

pdq and scor_abp are two different measures. The variables with suffix 1 means it is measured at Time 1, with suffix 2 means it is measured at Time 2, etc.

I would appreciate all the help there is!!! Thanks in advance!


Solution

  • When you have multiple columns to pivot, you can use .value in pivot_longer. In names_to, ".value" relates to the multiple columns you want to pivot, which are followed by a digit (the time column). This all needs to be set up in the names_to and names_pattern arguments.

    tidyr::pivot_longer(dat,
                        -id,
                        names_to = c(".value", "time"),
                        names_pattern = "(pdq|scor_abp)(\\d)")
    
    # A tibble: 9 × 4
         id time    pdq scor_abp
      <dbl> <chr> <dbl>    <dbl>
    1     1 1         3        1
    2     1 2         1        2
    3     1 3         4        1
    4     2 1         5        2
    5     2 2         3        2
    6     2 3         5        4
    7     3 1         6        3
    8     3 2         4        4
    9     3 3         4        5