
Apply percentile rank in Python Polars to a set of columns of a DataFrame


import polars as pl

df = pl.DataFrame(
    {
        "era": ["01", "01", "02", "02", "03", "03"],
        "pred1": [1, 2, 3, 4, 5, 6],
        "pred2": [2, 4, 5, 6, 7, 8],
        "pred3": [3, 5, 6, 8, 9, 1],
        "something_else": [5, 4, 3, 67, 5, 4],
    }
)
pred_cols = ["pred1", "pred2", "pred3"]  # columns to rank
ERA_COL = "era"  # grouping column

I'm trying to do the equivalent of pandas' percentile rank, rank(pct=True), in Polars. Polars' rank function lacks the pct flag that pandas has.

I looked at another question here: how to replace pandas df.rank(axis=1) with polars

But the results from that question (including when I apply it to my code) are off. Calculating the percentile rank in pandas gives me a single float per value, whereas the Polars example there gives me an array rather than a float, so that example must be calculating something different.

For reference, this is the pandas code:

df[list(pred_cols)] = df.groupby(ERA_COL, group_keys=False).apply(
    lambda d: d[list(pred_cols)].rank(pct=True)
)
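
To make the target output concrete, here is a self-contained sketch of the pandas side on the same data. The name pdf is introduced here only for illustration, and the sketch uses groupby(...)[cols].rank(pct=True) directly instead of apply, which should give the same per-era percentile ranks.

```python
import pandas as pd

pred_cols = ["pred1", "pred2", "pred3"]
ERA_COL = "era"

# Same data as the Polars frame above; "pdf" is a hypothetical name.
pdf = pd.DataFrame(
    {
        "era": ["01", "01", "02", "02", "03", "03"],
        "pred1": [1, 2, 3, 4, 5, 6],
        "pred2": [2, 4, 5, 6, 7, 8],
        "pred3": [3, 5, 6, 8, 9, 1],
        "something_else": [5, 4, 3, 67, 5, 4],
    }
)

# Rank within each era; pct=True scales the ranks to the (0, 1] range.
expected = pdf.groupby(ERA_COL)[pred_cols].rank(pct=True)
print(expected)
```

Each cell is a single float, for example 0.5 and 1.0 for the two rows of era "01".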

Solution

  • You can use the .rank() / .count() approach from the linked question together with .over(), so that both the rank and the count are computed within each era:

    df.select(
        (pl.col(pred_cols).rank() / pl.col(pred_cols).count())
           .over(ERA_COL)
    )
    
    shape: (6, 3)
    ┌───────┬───────┬───────┐
    │ pred1 ┆ pred2 ┆ pred3 │
    │ ---   ┆ ---   ┆ ---   │
    │ f64   ┆ f64   ┆ f64   │
    ╞═══════╪═══════╪═══════╡
    │ 0.5   ┆ 0.5   ┆ 0.5   │
    │ 1.0   ┆ 1.0   ┆ 1.0   │
    │ 0.5   ┆ 0.5   ┆ 0.5   │
    │ 1.0   ┆ 1.0   ┆ 1.0   │
    │ 0.5   ┆ 0.5   ┆ 1.0   │
    │ 1.0   ┆ 1.0   ┆ 0.5   │
    └───────┴───────┴───────┘
    

    Use .with_columns() instead of .select() if you want to "replace" the original values.

    df.with_columns(
        (pl.col(pred_cols).rank() / pl.col(pred_cols).count())
           .over(ERA_COL)
    )
    
    shape: (6, 5)
    ┌─────┬───────┬───────┬───────┬────────────────┐
    │ era ┆ pred1 ┆ pred2 ┆ pred3 ┆ something_else │
    │ --- ┆ ---   ┆ ---   ┆ ---   ┆ ---            │
    │ str ┆ f64   ┆ f64   ┆ f64   ┆ i64            │
    ╞═════╪═══════╪═══════╪═══════╪════════════════╡
    │ 01  ┆ 0.5   ┆ 0.5   ┆ 0.5   ┆ 5              │
    │ 01  ┆ 1.0   ┆ 1.0   ┆ 1.0   ┆ 4              │
    │ 02  ┆ 0.5   ┆ 0.5   ┆ 0.5   ┆ 3              │
    │ 02  ┆ 1.0   ┆ 1.0   ┆ 1.0   ┆ 67             │
    │ 03  ┆ 0.5   ┆ 0.5   ┆ 1.0   ┆ 5              │
    │ 03  ┆ 1.0   ┆ 1.0   ┆ 0.5   ┆ 4              │
    └─────┴───────┴───────┴───────┴────────────────┘