tidyr data-wrangling

Advanced pivot_longer transformation sequentially on a group of columns


I'm a little perplexed about how to proceed with this wrangling task.

I have a dataset in which raters assess lung sounds (S1, ..., S40). For each sound, they assessed several qualities of the recording (_A, _B, _X).

The data frame looks like the bd example below (3 raters scoring 2 sounds on 2 characteristics, A and B). Each row holds one rater's results for the different sounds (S1, S2) and sound characteristics (_A, _B).

bd <- data.frame(Operator = c("Rater1", "Rater2", "Rater3"),
                 S1_A = c("Good", "Good", "Bad"),  # quality assessment
                 S1_B = c(3, 2, 3),                # confidence in assessment, 1-5 Likert scale
                 S2_A = c("Don't know", "Bad", "Don't know"),
                 S2_B = c(4, 4, 2))

I'd like to get a result like this:

Operator  Sound  A           B
Rater1    1      Good        3
Rater1    2      Don't know  4
Rater2    1      Good        2
Rater2    2      Bad         4
Rater3    1      Bad         3
Rater3    2      Don't know  2

I can't work out how to write the code so that it first creates a column identifying the sound and then adds the sound characteristics as columns.

Thanks in advance for your help!!!


Solution

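  • You can use names_pattern = with a regex to achieve your desired outcome. In this example, the first capture group, (\\d+), picks up the sound number after the "S" and fills the Sound column, while the second, (A|B), picks up the suffix. Pairing that second group with the special ".value" entry in names_to makes each suffix a column of its own.

    Note that this works for your sample data, but you mention _A, _B, and _X as suffixes. To include _X (or any other suffix), simply add it to the second capture group, e.g. names_pattern = "S(\\d+)_(A|B|X)".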

    library(tidyr)
      
    bd |>
      pivot_longer(cols = -Operator,
                   names_to = c("Sound", ".value"),
                   names_pattern = "S(\\d+)_(A|B)")
    
    # # A tibble: 6 × 4
    #   Operator Sound   A              B
    #   <chr>    <chr>   <chr>      <dbl>
    # 1 Rater1   1       Good           3
    # 2 Rater1   2       Don't know     4
    # 3 Rater2   1       Good           2
    # 4 Rater2   2       Bad            4
    # 5 Rater3   1       Bad            3
    # 6 Rater3   2       Don't know     2
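As a minimal sketch of the _X extension mentioned above: the data frame bd_x below is hypothetical (the S1_X and S2_X values are invented purely for illustration), and names_transform is an optional extra, assuming you would rather store Sound as an integer than as a character string.

```r
library(tidyr)

# Hypothetical extension of the sample data: the S1_X and S2_X columns
# are made up here only to show how a third suffix is handled.
bd_x <- data.frame(Operator = c("Rater1", "Rater2", "Rater3"),
                   S1_A = c("Good", "Good", "Bad"),
                   S1_B = c(3, 2, 3),
                   S1_X = c("Yes", "No", "Yes"),
                   S2_A = c("Don't know", "Bad", "Don't know"),
                   S2_B = c(4, 4, 2),
                   S2_X = c("No", "No", "Yes"))

long_x <- bd_x |>
  pivot_longer(cols = -Operator,
               names_to = c("Sound", ".value"),
               # the second capture group now also accepts the X suffix
               names_pattern = "S(\\d+)_(A|B|X)",
               # convert the captured sound number from character to integer
               names_transform = list(Sound = as.integer))

long_x
```

This should give one row per rater and sound, with columns Operator, Sound, A, B and X. If the suffixes are not known in advance, a looser second group such as (.+) may be worth trying, at the cost of matching any suffix at all.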