rcombinations

pivot_wider in R, pivot all combinations


I have this data. Lots of people share the same id, but have different details about themselves recorded (name and month are often different).

data.frame(
  id = c("a", "a", "a", "a", "b", "b"),
  name = c("jane", "laura", "lauran", "lilly", "james", "jimmy"),
  month = c("april", "april", "may", "june", "june", "june")
)

id name month
a jane april
a laura april
a lauran may
a lilly june
b james june
b jimmy june

I want a way of pivoting this wider, so each combination of name for each id appears (note: do not need to use pivot_wider if there is a better way).

This data has 1000s of rows in reality.

id name_1 name_2 month_1 month_2 notes
a jane laura april april
a jane lauran april may
a jane lilly april june
a laura jane april april * see below
a laura lauran april may
a laura lilly april june
a lauran jane may april
a lauran laura may april
a lauran lilly may june
a lilly jane june april
a lilly laura june april
a lilly lauran june may
b james jimmy june june
b jimmy james june june

*ideally solution would not have this row because this combination of names lauran/jane appears in row 1 already jane/lauran etc etc etc on future rows. However this solution would be ok, if that is as far as we can get.

Some sort of dplyr answer would be amazing but ok without!


Solution

  • Here you go. Easier to use a many-to-many merge than pivot wider in this case.

    library(dplyr, warn.conflicts = FALSE) |> 
      suppressWarnings()
    
    dat <- data.frame(
      id = c(rep("a", 4), rep("b",2)), 
      name = c("jane", "laura", "lauran", "lilly", "james", "jimmy"),
      month = c("april", "april", "may", rep("june",3))
    )
    
    dat |> 
      # Created row numbers to allow filtering later. Note use of magritta pipe here.
      mutate(row = row_number(), .by = id) %>% 
      left_join(., ., by = "id", relationship = "many-to-many", suffix = c("_1", "_2")) |> 
      filter(row_2 > row_1) |> 
      select(id, starts_with("name"), starts_with("month"))
    #>   id name_1 name_2 month_1 month_2
    #> 1  a   jane  laura   april   april
    #> 2  a   jane lauran   april     may
    #> 3  a   jane  lilly   april    june
    #> 4  a  laura lauran   april     may
    #> 5  a  laura  lilly   april    june
    #> 6  a lauran  lilly     may    june
    #> 7  b  james  jimmy    june    june
    

    Created on 2024-10-29 with reprex v2.1.1