I have a list of customer IDs, each with a list of unique products they used. There can theoretically be up to ~150 unique products.
library(tibble)
df <- tibble(ID = c(1, 1, 1, 2, 2, 3, 3, 4),
             prod = c("Prod1", "Prod2", "Prod3", "Prod1", "Prod4", "Prod3", "Prod5", "Prod2"))
From that, I need all possible combinations of products for each ID, not only the top-level one (grouped by ID). That is, the result should include the combination of all of an ID's products, as expand_grid() would do, but also every combination of 1, ..., n elements, where n is the number of unique products the ID has.
The final dataset should therefore look like this:
df_results <- tibble(ID = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4),
                     combo = c("Prod1", "Prod2", "Prod3", "Prod1|Prod2", "Prod1|Prod3", "Prod2|Prod3", "Prod1|Prod2|Prod3",
                               "Prod1", "Prod4", "Prod1|Prod4",
                               "Prod3", "Prod5", "Prod3|Prod5",
                               "Prod2"))
An extension of the canonical answer:
library(dplyr)
df %>%
  group_by(ID) %>%
  # every combination of size m = 1, ..., n per ID, collapsed with "|"
  reframe(combo = as.character(do.call(c, lapply(seq_along(prod),
    \(m) combn(x = prod, m = m, FUN = \(x) paste(x, collapse = "|"))))))
# A tibble: 14 × 2
      ID combo
   <dbl> <chr>
 1     1 Prod1
 2     1 Prod2
 3     1 Prod3
 4     1 Prod1|Prod2
 5     1 Prod1|Prod3
 6     1 Prod2|Prod3
 7     1 Prod1|Prod2|Prod3
 8     2 Prod1
 9     2 Prod4
10     2 Prod1|Prod4
11     3 Prod3
12     3 Prod5
13     3 Prod3|Prod5
14     4 Prod2
Or in base R:
# stack() returns the columns values and ind; [2:1] puts the ID column (ind) first
stack(tapply(df$prod, df$ID,
             \(prod) do.call(c, lapply(seq_along(prod),
               \(m) combn(prod, m, FUN = \(x) paste(x, collapse = "|"))))))[2:1]
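The per-ID step used in both versions can also be pulled out into a small function, which makes it easier to check on its own. This is only a sketch: the name all_combos and its sep argument are assumptions for illustration and are not part of the answers above. It also makes the output size visible: an ID with n products yields 2^n - 1 combinations, so 20 products already give 1,048,575 rows for that one ID, and anything close to the ~150 products mentioned in the question cannot be enumerated in full.

```r
# Hypothetical helper (name and `sep` argument are assumptions):
# all non-empty combinations of a character vector, each joined with `sep`.
all_combos <- function(x, sep = "|") {
  unlist(lapply(seq_along(x), function(m) {
    # combn() applies FUN to every combination of size m
    combn(x, m, FUN = function(v) paste(v, collapse = sep))
  }))
}

all_combos(c("Prod1", "Prod2", "Prod3"))

# Rows produced for a single ID with n products: 2^n - 1
n <- c(3, 10, 20)
2^n - 1
```

Assuming dplyr is loaded, it should drop into the pipeline as df %>% group_by(ID) %>% reframe(combo = all_combos(prod)).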