python, dataframe, python-polars

PyPolars, get value from column based on value in another column without for loop


Using PyPolars, I'm trying to create a new column whose value is taken from one of several columns, chosen row by row based on a condition.

The condition is expressed as a dictionary that maps each value of `col_val` to the name of the column to read from. The following code shows more precisely what I'm looking for.

import polars as pl

def do_that_mapping(lf: pl.LazyFrame) -> pl.LazyFrame:
  my_map = {
    "A": "col_1",
    "B": "col_2",
    "C": "col_2",
    "D": "col_3",
    # many more values to map
  }

  for k, v in my_map.items():
    # ofc I don't want to use that in a loop, I'm looking for a way to execute
    # the following line with a native method of Polars and remove the
    # Python iteration on `my_map`
    lf = lf.with_columns(pl.when(pl.col("col_val") == k).then(pl.col(v)).alias("new_col"))

  return lf


x = pl.LazyFrame(
    data={
        "col_val": ["A", "B", "C", "D"],
        "col_1": [22, 1, 54, 82],
        "col_2": [1, 32, 7, 8],
        "col_3": [4, 6, 90, 3],
    },
    schema={
        "col_val": pl.String,
        "col_1": pl.Int16,
        "col_2": pl.Int16,
        "col_3": pl.Int16,
    },
)
x_with_new_col = x.pipe(do_that_mapping)

Solution

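  • One possible approach is to create one pl.when().then() expression per dictionary entry and combine them with pl.coalesce() to produce a single column. A when/then with no otherwise() yields null on rows where the condition is false, so pl.coalesce() returns the first non-null result for each row: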

    # `x` is the LazyFrame and `my_map` the dictionary from the question
    x.with_columns(
        new_col=pl.coalesce(
            pl.when(pl.col("col_val") == k).then(pl.col(v))
            for k, v in my_map.items()
        )
    ).collect()
    
    shape: (4, 5)
    ┌─────────┬───────┬───────┬───────┬─────────┐
    │ col_val ┆ col_1 ┆ col_2 ┆ col_3 ┆ new_col │
    │ ---     ┆ ---   ┆ ---   ┆ ---   ┆ ---     │
    │ str     ┆ i16   ┆ i16   ┆ i16   ┆ i16     │
    ╞═════════╪═══════╪═══════╪═══════╪═════════╡
    │ A       ┆ 22    ┆ 1     ┆ 4     ┆ 22      │
    │ B       ┆ 1     ┆ 32    ┆ 6     ┆ 32      │
    │ C       ┆ 54    ┆ 7     ┆ 90    ┆ 7       │
    │ D       ┆ 82    ┆ 8     ┆ 3     ┆ 3       │
    └─────────┴───────┴───────┴───────┴─────────┘