Is there a similar function in python polars like 'transform' in pandas?

In pandas, I can use transform to generate the codeindex column:

day = ['day1','day2','day3','day4','day1','day2','day3','day1','day2']

code = ["a","a","a","a","b","b","b","c","c"]

price = [1,2,3,4,5,6,7,8,9]

df = pd.DataFrame({"date":day,"code":code,"price":price})

df['codeindex'] = df.groupby('code')['date'].transform(lambda x: range(0, len(x), 1))

How would I generate the codeindex column using Polars?

Expected output:

┌──────┬──────┬───────┬───────────┐
│ date ┆ code ┆ price ┆ codeindex │
│ ---  ┆ ---  ┆ ---   ┆ ---       │
│ str  ┆ str  ┆ i64   ┆ i64       │
╞══════╪══════╪═══════╪═══════════╡
│ day1 ┆ a    ┆ 1     ┆ 0         │
│ day2 ┆ a    ┆ 2     ┆ 1         │
│ day3 ┆ a    ┆ 3     ┆ 2         │
│ day4 ┆ a    ┆ 4     ┆ 3         │
│ day1 ┆ b    ┆ 5     ┆ 0         │
│ day2 ┆ b    ┆ 6     ┆ 1         │
│ day3 ┆ b    ┆ 7     ┆ 2         │
│ day1 ┆ c    ┆ 8     ┆ 0         │
│ day2 ┆ c    ┆ 9     ┆ 1         │
└──────┴──────┴───────┴───────────┘

Solution

You can use window expressions to deal with expressions that only need to be applied within a group.

A windows expression operates on groups you partition by with .over(). It is defined by an expression part like col("date").cum_count() and a partition part defined by .over("code").

If you use an aggregation the result will be broadcasted to match the size of the group.

The code looks like this:

day = ['day1','day2','day3','day4','day1','day2','day3','day1','day2']

code = ["a","a","a","a","b","b","b","c","c"]

price = [1,2,3,4,5,6,7,8,9]


df = pl.DataFrame({"date":day,"code":code,"price":price})

(df.select(
    pl.all(),
    pl.col("date").cum_count().over("code").alias("codeindex"),
))

outputs

shape: (9, 4)
┌──────┬──────┬───────┬───────────┐
│ date ┆ code ┆ price ┆ codeindex │
│ ---  ┆ ---  ┆ ---   ┆ ---       │
│ str  ┆ str  ┆ i64   ┆ u32       │
╞══════╪══════╪═══════╪═══════════╡
│ day1 ┆ a    ┆ 1     ┆ 1         │
│ day2 ┆ a    ┆ 2     ┆ 2         │
│ day3 ┆ a    ┆ 3     ┆ 3         │
│ day4 ┆ a    ┆ 4     ┆ 4         │
│ day1 ┆ b    ┆ 5     ┆ 1         │
│ day2 ┆ b    ┆ 6     ┆ 2         │
│ day3 ┆ b    ┆ 7     ┆ 3         │
│ day1 ┆ c    ┆ 8     ┆ 1         │
│ day2 ┆ c    ┆ 9     ┆ 2         │
└──────┴──────┴───────┴───────────┘