Interpolate based on datetimes


In pandas, I can interpolate based on datetimes like this:

from datetime import datetime

import numpy as np
import pandas as pd

df1 = pd.DataFrame(
    {
        "ts": [
            datetime(2020, 1, 1),
            datetime(2020, 1, 3, 0, 0, 12),
            datetime(2020, 1, 3, 0, 1, 35),
            datetime(2020, 1, 4),
        ],
        "value": [1, np.nan, np.nan, 3],
    }
)
df1.set_index('ts').interpolate(method='index')

Outputs:

                        value
ts
2020-01-01 00:00:00  1.000000
2020-01-03 00:00:12  2.333426
2020-01-03 00:01:35  2.334066
2020-01-04 00:00:00  3.000000

Is there a similar method in polars? Say, starting with:

from datetime import datetime

import polars as pl

df1 = pl.DataFrame(
    {
        "ts": [
            datetime(2020, 1, 1),
            datetime(2020, 1, 3, 0, 0, 12),
            datetime(2020, 1, 3, 0, 1, 35),
            datetime(2020, 1, 4),
        ],
        "value": [1, None, None, 3],
    }
)

shape: (4, 2)
┌─────────────────────┬───────┐
│ ts                  ┆ value │
│ ---                 ┆ ---   │
│ datetime[μs]        ┆ i64   │
╞═════════════════════╪═══════╡
│ 2020-01-01 00:00:00 ┆ 1     │
│ 2020-01-03 00:00:12 ┆ null  │
│ 2020-01-03 00:01:35 ┆ null  │
│ 2020-01-04 00:00:00 ┆ 3     │
└─────────────────────┴───────┘

EDIT: I've updated the example to make the timestamps more irregular, so that upsample can't be used as a solution, and to make it clear that something more generic is needed.


Solution

  • Update: Expr.interpolate_by was added in Polars 0.20.28

    df1.with_columns(pl.col("value").interpolate_by("ts"))
    
    shape: (4, 2)
    ┌─────────────────────┬──────────┐
    │ ts                  ┆ value    │
    │ ---                 ┆ ---      │
    │ datetime[μs]        ┆ f64      │
    ╞═════════════════════╪══════════╡
    │ 2020-01-01 00:00:00 ┆ 1.0      │
    │ 2020-01-03 00:00:12 ┆ 2.333426 │
    │ 2020-01-03 00:01:35 ┆ 2.334066 │
    │ 2020-01-04 00:00:00 ┆ 3.0      │
    └─────────────────────┴──────────┘
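
For intuition about where 2.333426 and 2.334066 come from, here is a minimal pure-Python sketch of linear interpolation weighted by elapsed time. It is not how Polars or pandas implement it. The helper name interpolate_by_time is hypothetical, and it assumes the timestamps are sorted ascending and that the first and last values are known.

```python
from datetime import datetime


def interpolate_by_time(ts, values):
    """Linearly fill None entries in `values`, weighted by elapsed time in `ts`.

    Hypothetical helper for illustration only. Assumes `ts` is sorted
    ascending; gaps before the first or after the last known value stay None.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    out = [None if v is None else float(v) for v in values]
    # Walk over each pair of neighbouring known points and fill the gap between them.
    for left, right in zip(known, known[1:]):
        span = (ts[right] - ts[left]).total_seconds()
        for i in range(left + 1, right):
            # Fraction of the time span elapsed at ts[i].
            frac = (ts[i] - ts[left]).total_seconds() / span
            out[i] = values[left] + (values[right] - values[left]) * frac
    return out


ts = [
    datetime(2020, 1, 1),
    datetime(2020, 1, 3, 0, 0, 12),
    datetime(2020, 1, 3, 0, 1, 35),
    datetime(2020, 1, 4),
]
values = [1, None, None, 3]

result = interpolate_by_time(ts, values)
print([round(v, 6) for v in result])
# [1.0, 2.333426, 2.334066, 3.0]
```

For example, the second row lies 172812 seconds into a 259200-second span, so its value is 1 + 2 * 172812 / 259200 ≈ 2.333426.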