
Interpolate values in a Polars column based on another datetime column


In pandas, I can interpolate based on a datetime column like this:

import polars as pl
from datetime import datetime

df1 = pl.DataFrame(
    {
        "ts": [
            datetime(2020, 1, 1),
            datetime(2020, 1, 3, 0, 0, 12),
            datetime(2020, 1, 3, 0, 1, 35),
            datetime(2020, 1, 4),
        ],
        "value": [1, None, None, 3],
    }
)
df1.to_pandas().set_index('ts').interpolate(method='index')

Outputs:

                        value
ts
2020-01-01 00:00:00  1.000000
2020-01-03 00:00:12  2.333426
2020-01-03 00:01:35  2.334066
2020-01-04 00:00:00  3.000000

Is there a similar method in polars?


Solution

  • Update: Expr.interpolate_by was added in Polars 0.20.28. It fills nulls by interpolating against another column:

    df1.with_columns(pl.col("value").interpolate_by("ts"))
    
    shape: (4, 2)
    ┌─────────────────────┬──────────┐
    │ ts                  ┆ value    │
    │ ---                 ┆ ---      │
    │ datetime[μs]        ┆ f64      │
    ╞═════════════════════╪══════════╡
    │ 2020-01-01 00:00:00 ┆ 1.0      │
    │ 2020-01-03 00:00:12 ┆ 2.333426 │
    │ 2020-01-03 00:01:35 ┆ 2.334066 │
    │ 2020-01-04 00:00:00 ┆ 3.0      │
    └─────────────────────┴──────────┘
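
To see where those numbers come from, here is a minimal pure-Python sketch of linear interpolation over time. It is not how Polars implements interpolate_by; the helper name interpolate_by_time is hypothetical, and it assumes the timestamps are sorted and strictly increasing. Each missing value is placed between its nearest known neighbours in proportion to the elapsed time.

```python
from datetime import datetime


def interpolate_by_time(ts, values):
    """Hypothetical helper: fill None entries linearly in time.

    Assumes `ts` is sorted and strictly increasing. Leading or trailing
    None entries are left as None because they have only one neighbour.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    out = [float(v) if v is not None else None for v in values]
    # Walk over each pair of consecutive known points and fill the gap.
    for lo, hi in zip(known, known[1:]):
        span = (ts[hi] - ts[lo]).total_seconds()
        for i in range(lo + 1, hi):
            frac = (ts[i] - ts[lo]).total_seconds() / span
            out[i] = values[lo] + (values[hi] - values[lo]) * frac
    return out


ts = [
    datetime(2020, 1, 1),
    datetime(2020, 1, 3, 0, 0, 12),
    datetime(2020, 1, 3, 0, 1, 35),
    datetime(2020, 1, 4),
]
values = [1, None, None, 3]

result = interpolate_by_time(ts, values)
print([round(v, 6) for v in result])
# [1.0, 2.333426, 2.334066, 3.0]
```

For the second row, 172812 of the 259200 seconds between the two known points have elapsed, so the value is 1 + 2 * 172812 / 259200 ≈ 2.333426, matching both the pandas and Polars outputs.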