rdplyrpivottidyverse

pivot_longer into several pairs of columns


I'm needing to pivot_longer across multiple groups of columns, creating multiple names--values pairs.

For instance, I need to go from something like this:

df_raw <- tribble(
  ~id, ~belief_dog, ~belief_bull_frog, ~belief_fish, ~age, ~norm_bull_frog, ~norm_fish, ~norm_dog, ~gender,
  "b2x8",    1,           4,          3,         41,     4,       2,          10,         2,
  "m89w",    3,           6,          2,         19,     1,       2,           3,         1,
  "32x8",    1,           5,          2,         38,     9,       1,           8,         3
)

And turn it into something lie this:

df_final <- tribble(
  ~id,   ~belief_animal, ~belief_rating, ~norm_animal, ~norm_rating, ~age,   ~gender,
  "b2x8",    "dog",           1,          "bull_frog",      4,        41,       2,
  "b2x8",    "bull_frog",     4,          "fish",           2,        41,       2,
  "b2x8",    "fish",          3,          "dog",            10,       41,       2,
  "m89w",    "dog",           3,          "bull_frog",      1,        19,       1,
  "m89w",    "bull_frog",     6,          "fish",           2,        19,       1,
  "m89w",    "fish",          2,          "dog",            3,        19,       1,
  "32x8",    "dog",           1,          "bull_frog",      9,        38,       3,
  "32x8",    "bull_frog",     5,          "fish",           1,        38,       3,
  "32x8",    "fish",          2,          "dog",            8,        38,       3
)

In other words, anything starting with "belief_" should get pivoted in one names--values pair & anything starting with "norm_" should be pivoted into another names--values pair.

I tried looking at several other Stack Overflow pages with somewhat related content but wasn't able to translate those solutions to this situation.

Any help would be appreciated, with a strong preference for dplyr solutions.

THANKS!


Solution

  • With tidyverse, you can pivot on the two sets of columns that starts with belief and norm. Then, use regex to split into groups according to the first underscore (since some column names have multiple underscores). Essentially, we are putting belief or norm (the first group in the column name) into their own columns (i.e., .value), then the second part of the group (i.e., animal names) are put into one column named animal.

    library(tidyverse)
    
    df_raw %>%
      pivot_longer(cols = c(starts_with("belief"), starts_with("norm")),
                   names_to = c('.value', 'animal'),
                   names_pattern = '(.*?)_(.*)') %>% 
      rename(belief_rating = belief, norm_rating = norm)
    

    Output

      id      age gender animal    belief_rating norm_rating
      <chr> <dbl>  <dbl> <chr>             <dbl>       <dbl>
    1 b2x8     41      2 dog                   1          10
    2 b2x8     41      2 bull_frog             4           4
    3 b2x8     41      2 fish                  3           2
    4 m89w     19      1 dog                   3           3
    5 m89w     19      1 bull_frog             6           1
    6 m89w     19      1 fish                  2           2
    7 32x8     38      3 dog                   1           8
    8 32x8     38      3 bull_frog             5           9
    9 32x8     38      3 fish                  2           1