python-polarsrust-polarsnodejs-polars

How to perform a max_by window aggregation in Polars?


I am trying to use polars to do a window aggregate over one value, but map it back to another.

For example, if i wanted to get the name of the max value in a group, instead of (or in combination to) just the max value.

assuming an input of something like this.

df = pl.from_repr("""
┌───────┬──────┬───────┐
│ label ┆ name ┆ value │
│ ---   ┆ ---  ┆ ---   │
│ str   ┆ str  ┆ f64   │
╞═══════╪══════╪═══════╡
│ a.    ┆ foo  ┆ 1.0   │
│ a.    ┆ bar  ┆ 2.0   │
│ b.    ┆ baz  ┆ 1.5   │
│ b.    ┆ boo  ┆ -1.0  │
└───────┴──────┴───────┘
""")
# 'max_by' is not a real method, just using it to express what i'm trying to achieve. 
df.select(pl.col('label'), pl.col('name').max_by('value').over('label'))

i want an output like this

shape: (2, 2)
┌───────┬──────┐
│ label ┆ name │
│ ---   ┆ ---  │
│ str   ┆ str  │
╞═══════╪══════╡
│ a.    ┆ bar  │
│ b.    ┆ baz  │
└───────┴──────┘

ideally with the value too. But i know i can easily add that in via pl.col('value').max().over('label').

shape: (2, 3)
┌───────┬──────┬───────┐
│ label ┆ name ┆ value │
│ ---   ┆ ---  ┆ ---   │
│ str   ┆ str  ┆ f64   │
╞═══════╪══════╪═══════╡
│ a.    ┆ bar  ┆ 2.0   │
│ b.    ┆ baz  ┆ 1.5   │
└───────┴──────┴───────┘

Solution

  • You were close. There is a sort_by expression that can be used.

    df.group_by('label').agg(pl.all().sort_by('value').last())
    
    shape: (2, 3)
    ┌───────┬──────┬───────┐
    │ label ┆ name ┆ value │
    │ ---   ┆ ---  ┆ ---   │
    │ str   ┆ str  ┆ f64   │
    ╞═══════╪══════╪═══════╡
    │ a.    ┆ bar  ┆ 2.0   │
    │ b.    ┆ baz  ┆ 1.5   │
    └───────┴──────┴───────┘
    

    If you need a windowed version of this:

    df.with_columns(
        pl.col('name','value').sort_by('value').last().over('label').name.suffix("_max")
    )
    
    shape: (4, 5)
    ┌───────┬──────┬───────┬──────────┬───────────┐
    │ label ┆ name ┆ value ┆ name_max ┆ value_max │
    │ ---   ┆ ---  ┆ ---   ┆ ---      ┆ ---       │
    │ str   ┆ str  ┆ f64   ┆ str      ┆ f64       │
    ╞═══════╪══════╪═══════╪══════════╪═══════════╡
    │ a.    ┆ foo  ┆ 1.0   ┆ bar      ┆ 2.0       │
    │ a.    ┆ bar  ┆ 2.0   ┆ bar      ┆ 2.0       │
    │ b.    ┆ baz  ┆ 1.5   ┆ baz      ┆ 1.5       │
    │ b.    ┆ boo  ┆ -1.0  ┆ baz      ┆ 1.5       │
    └───────┴──────┴───────┴──────────┴───────────┘