I am trying to update specific rows in a python-polars DataFrame where two columns ("Season"
and "Wk"
) meet certain conditions, using values from a list or Series that should align with the filtered rows. In pandas, I would use .loc[]
to do this, but I haven't found a way to achieve the same result with Polars.
import polars as pl
# Sample DataFrame
df = pl.DataFrame({
"Season": [2024, 2024, 2024, 2024],
"Wk": [28, 28, 29, 29],
"position": [1, 2, 3, 4]
})
# List of new values for the filtered rows
new_positions = [10, 20] # Example values aligned with the filtered rows
# Condition to update only where Season is 2024 and Wk is 29
df = df.with_columns(
# Expected update logic here
)
I've tried approaches with when().then()
and zip_with()
, but they either don't accept lists/Series directly or don't align the new values correctly with the filtered rows. Is there a recommended way to update a column conditionally in python-polars with values from a list or Series in the same order as the filter?
You could generate the indices with when/then
and .cum_sum()
df.with_columns(
idx = pl.when(Season=2024, Wk=29).then(1).cum_sum() - 1
)
shape: (4, 4)
┌────────┬─────┬──────────┬──────┐
│ Season ┆ Wk ┆ position ┆ idx │
│ --- ┆ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 ┆ i32 │
╞════════╪═════╪══════════╪══════╡
│ 2024 ┆ 28 ┆ 1 ┆ null │
│ 2024 ┆ 28 ┆ 2 ┆ null │
│ 2024 ┆ 29 ┆ 3 ┆ 0 │
│ 2024 ┆ 29 ┆ 4 ┆ 1 │
└────────┴─────┴──────────┴──────┘
Which you can .get()
and then .fill_null()
the remaining original values.
df.with_columns(
pl.lit(pl.Series(new_positions))
.get(pl.when(Season=2024, Wk=29).then(1).cum_sum() - 1)
.fill_null(pl.col.position)
.alias("position")
)
shape: (4, 3)
┌────────┬─────┬──────────┐
│ Season ┆ Wk ┆ position │
│ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 │
╞════════╪═════╪══════════╡
│ 2024 ┆ 28 ┆ 1 │
│ 2024 ┆ 28 ┆ 2 │
│ 2024 ┆ 29 ┆ 10 │
│ 2024 ┆ 29 ┆ 20 │
└────────┴─────┴──────────┘
Or perhaps .replace_strict()
df.with_columns(
(pl.when(Season=2024, Wk=29).then(1).cum_sum() - 1)
.replace_strict(dict(enumerate(new_positions)), default=pl.col.position)
.alias("position")
)