I want to create a data frame with all of the unique combinations of three columns, where the order of the values doesn't matter. In my example, I want to list every combination of ideologies that a group of three people could have.
In my example, "No opinion", "Moderate", "Conservative" is the same as "Conservative", "No opinion", "Moderate", which is the same as "Moderate", "No opinion", "Conservative", etc. All of these combinations should be represented by one row.
I've seen similar threads about using distinct() for home and away sports teams, but I don't think that approach works for this problem.
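library(tidyverse)

# Note: without an explicit levels = argument, the factor levels are
# ordered alphabetically, not in the order listed here.
political_spectrum_values <-
  factor(c("Far left",
           "Liberal",
           "Moderate",
           "Conservative",
           "Far right",
           "No opinion"),
         ordered = TRUE)

# All ordered triples of ideologies (6 x 6 x 6 = 216 rows)
political_groups_of_3 <-
  crossing(first_person = political_spectrum_values,
           second_person = political_spectrum_values,
           third_person = political_spectrum_values)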
I've considered making some kind of combined variable by piping into this line, but I'm not sure where to take it from here:
unite(col = "group_composition", c(first_person, second_person, third_person), sep = "_")
EDIT: After working with this problem longer, I've reshaped the data in a way that might make this easier:
crossing(first_person = political_spectrum_values,
         second_person = political_spectrum_values,
         third_person = political_spectrum_values) %>%
  mutate(group_n = row_number()) %>%
  pivot_longer(cols = c(first_person, second_person, third_person),
               values_to = "ideology",
               names_to = "group") %>%
  select(-group)
A base R method is to create all the combinations of political_spectrum_values taken 3 at a time using expand.grid, sort the values within each row, and select the unique rows.
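df <- expand.grid(first_person = political_spectrum_values,
                  second_person = political_spectrum_values,
                  third_person = political_spectrum_values)
# Sort the values within each row so that permutations of the same group
# become identical (apply() coerces the factors to character, so the
# sort is alphabetical)
df[] <- t(apply(df, 1, sort))
# Keep one row per unordered group
unique(df)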
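As a sanity check on this approach: the number of unordered groups of k items drawn from n values, with repetition allowed, is choose(n + k - 1, k), so 6 labels in groups of 3 should leave 56 rows. Below is a minimal sketch of that check, assuming plain character labels rather than the ordered factor; the names `values`, `grid`, `sorted`, and `groups` are illustrative only and not part of the original code.

```r
# Illustrative stand-in for the six ideology labels (plain characters)
values <- c("Far left", "Liberal", "Moderate",
            "Conservative", "Far right", "No opinion")

# All ordered triples: 6^3 = 216 rows
grid <- expand.grid(first_person = values,
                    second_person = values,
                    third_person = values,
                    stringsAsFactors = FALSE)

# Sort within each row so permutations of the same group become identical
sorted <- t(apply(grid, 1, function(x) sort(unname(x))))

# Keep one row per unordered group
groups <- unique(as.data.frame(sorted, stringsAsFactors = FALSE))
names(groups) <- c("first_person", "second_person", "third_person")

# Combinations with repetition: choose(n + k - 1, k)
expected <- choose(length(values) + 3 - 1, 3)
stopifnot(nrow(groups) == expected)
```

If the row count ever differs from the formula, the rows were probably not sorted consistently before calling unique().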
If needed as a single string:
unique(apply(df, 1, function(x) paste0(sort(x), collapse = "_")))
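To see why the sorted string works as an order-independent key, here is a small sketch using two of the orderings from the question. The helper name `group_key` is hypothetical and only wraps the same paste0(sort(x), ...) call used above.

```r
# Hypothetical helper: build an order-independent key for one group
group_key <- function(x) paste0(sort(x), collapse = "_")

a <- group_key(c("No opinion", "Moderate", "Conservative"))
b <- group_key(c("Conservative", "No opinion", "Moderate"))

# Both orderings collapse to the same string
stopifnot(identical(a, b))
print(a)  # "Conservative_Moderate_No opinion"
```

Because every permutation of a group yields the same key, calling unique() on the keys leaves exactly one entry per group composition.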