Using Polars in Python, I'm trying to create a new column whose value is taken from one of several columns, chosen row by row based on a condition.
The condition is expressed as a dictionary that maps each value of col_val to the name of the column to read. The following code shows more precisely what I'm looking for.
import polars as pl

def do_that_mapping(lf: pl.LazyFrame) -> pl.LazyFrame:
    my_map = {
        "A": "col_1",
        "B": "col_2",
        "C": "col_2",
        "D": "col_3",
        # many more values to map
    }
    for k, v in my_map.items():
        # ofc I don't want to use that in a loop, I'm looking for a way to execute
        # the following line with a native method of Polars and remove the
        # Python iteration on `my_map`
        lf = lf.with_columns(pl.when(pl.col("col_val") == k).then(pl.col(v)).alias("new_col"))
    return lf

x = pl.LazyFrame(
    data={
        "col_val": ["A", "B", "C", "D"],
        "col_1": [22, 1, 54, 82],
        "col_2": [1, 32, 7, 8],
        "col_3": [4, 6, 90, 3],
    },
    schema={
        "col_val": pl.String,
        "col_1": pl.Int16,
        "col_2": pl.Int16,
        "col_3": pl.Int16,
    },
)
x_with_new_col = x.pipe(do_that_mapping)
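One possible approach is to build one pl.when().then() expression per dictionary entry and combine them with pl.coalesce() to produce a single column. A when/then without an otherwise yields null wherever its condition is false, so coalesce picks, for each row, the value from the first branch that matched: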
df.with_columns(
    new_col=pl.coalesce(
        pl.when(pl.col("col_val") == k).then(pl.col(v))
        for k, v in my_map.items()
    )
)
shape: (4, 5)
┌─────────┬───────┬───────┬───────┬─────────┐
│ col_val ┆ col_1 ┆ col_2 ┆ col_3 ┆ new_col │
│ ---     ┆ ---   ┆ ---   ┆ ---   ┆ ---     │
│ str     ┆ i16   ┆ i16   ┆ i16   ┆ i16     │
╞═════════╪═══════╪═══════╪═══════╪═════════╡
│ A       ┆ 22    ┆ 1     ┆ 4     ┆ 22      │
│ B       ┆ 1     ┆ 32    ┆ 6     ┆ 32      │
│ C       ┆ 54    ┆ 7     ┆ 90    ┆ 7       │
│ D       ┆ 82    ┆ 8     ┆ 3     ┆ 3       │
└─────────┴───────┴───────┴───────┴─────────┘