Is there a way to generate combinations of items within a list inside a Polars column without resorting to .map_elements() + itertools for each row?
This is my current solution:
import polars as pl
import itertools

(pl.DataFrame({'col': [['a', 'b', 'c']]})
   .with_columns(pl.col('col')
                   .map_elements(lambda list_o_things: [sorted((thing_1, thing_2))
                                                        for thing_1, thing_2
                                                        in itertools.combinations(list_o_things, 2)])
                )
)
which returns this:
[['a', 'b'], ['a', 'c'], ['b', 'c']]
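Explode the list column, cross join the exploded frame with itself, filter to keep only pairs where the left value is less than the right value (this drops self-pairs and mirrored duplicates), combine the two columns with concat_list, and implode the result back into a nested list. Note that the cross join pairs every element with every other element in the column, so this treats the whole column as one list, which fits the single-row example.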
df = pl.DataFrame({'col': [['a', 'b', 'c']]})
(
    df
    .explode('col')
    .join(df.explode('col'), how='cross')
    .filter(pl.col('col') < pl.col('col_right'))
    .select(pl.concat_list('col', 'col_right').implode())
)
shape: (1, 1)
┌──────────────────────────────────────┐
│ col                                  │
│ ---                                  │
│ list[list[str]]                      │
╞══════════════════════════════════════╡
│ [["a", "b"], ["a", "c"], ["b", "c"]] │
└──────────────────────────────────────┘