rdplyrtidyrdata-wrangling

pivot_longer() with parallel (unlinked) sets of columns


I'm trying to use pivot_longer() to rearrange a dataset I was given, which looks like the result of a database join operation. Here's an example of what it looks like:

dat <- tibble('Plant_Name'=c('tree','grass','bush','moss','weed','algae'),
       'Animal_Name'=c('duck',NA,'horse',NA,NA,'duck'),
       'Plant_Info'=c('tall','short','bushy','fuzzy','bad',NA),
       'Animal_Info'=c('wet',NA,'fast',NA,NA,'wet'))

# # A tibble: 6 × 4
# Plant_Name Animal_Name Plant_Info Animal_Info
# <chr>      <chr>       <chr>      <chr>      
# 1 tree       duck        tall       wet        
# 2 grass      NA          short      NA         
# 3 bush       horse       bushy      fast       
# 4 moss       NA          fuzzy      NA         
# 5 weed       NA          bad        NA         
# 6 algae      duck        NA         wet     

These data are not tidy, and are essentially just two other dataframes ("Plant" and "Animal") mashed together. I'm interested in "unjoining" the two separate datasets, and I'd like it to look like this:

# # A tibble: 7 × 3
# Class Name  Info 
# <chr>   <chr> <chr>
# 1 Animal  duck  wet  
# 2 Animal  horse fast 
# 3 Plant   bush  bushy
# 4 Plant   grass short
# 5 Plant   moss  fuzzy
# 6 Plant   tree  tall 
# 7 Plant   weed  bad 

I'm essentially wanting to do a pivot_wider() operation on two sets of columns that are not linked to each other. I could do this by creating two separate dataframes from each pair of columns, pivoting, then using bind_rows() to rejoin them, but I think there's a way to do this in one fell swoop.

Here's the solution I've come up with. Problem is, using unite isn't really scalable:

dat %>% 
  #This part is clunky: how to apply this to N pairs?
  unite('Plant',contains('Plant')) %>% unite('Animal',contains('Animal')) %>% 
  pivot_longer(everything(),names_to = 'Class') %>% 
  separate_wider_delim(cols = value,delim = '_',names=c('Name','Info')) %>%
  mutate(across(c(Name,Info),~ifelse(.x=='NA',NA,.x))) %>% #Change character "NA" to NA
  na.omit() %>% #Get rid of missing rows
  arrange(Class,Name) %>% distinct()

Is there a proper way to do this in pivot_wider() that I'm not aware of?

Edit: this superficially question is similar to other "pivot multiple column" questions (such as this one, this one, and this one, as well as examples from Data Wrangling: Advanced Pivoting), but these questions deal with independent pairs of columns.


Solution

  • dat |>
      pivot_longer(everything(), names_to = c("Class", ".value"), 
                   names_sep = "_", cols_vary = "slowest")
    
    
    
     Class  Name  Info 
       <chr>  <chr> <chr>
     1 Plant  tree  tall 
     2 Plant  grass short
     3 Plant  bush  bushy
     4 Plant  moss  fuzzy
     5 Plant  weed  bad  
     6 Plant  algae NA   
     7 Animal duck  wet  
     8 Animal NA    NA   
     9 Animal horse fast 
    10 Animal NA    NA   
    11 Animal NA    NA   
    12 Animal duck  wet