I have a polars dataframe with two date columns that represent a start and end date and then a value that I want to repeat for all dates in between those two dates so that I can join those on other tables.
Example input is
df = pl.from_repr("""
┌─────┬─────────────────────┬─────────────────────┬───────┐
│ id ┆ start ┆ end ┆ value │
│ --- ┆ --- ┆ --- ┆ --- │
│ str ┆ datetime[ns] ┆ datetime[ns] ┆ i64 │
╞═════╪═════════════════════╪═════════════════════╪═══════╡
│ 123 ┆ 2022-01-01 00:00:00 ┆ 2022-01-04 00:00:00 ┆ 10 │
│ abc ┆ 2022-03-04 00:00:00 ┆ 2022-03-04 00:00:00 ┆ 3 │
│ 456 ┆ 2022-05-11 00:00:00 ┆ 2022-05-16 00:00:00 ┆ 4 │
└─────┴─────────────────────┴─────────────────────┴───────┘
""")
and expected output is
shape: (11, 3)
┌─────┬─────────────────────┬───────┐
│ id ┆ date ┆ value │
│ --- ┆ --- ┆ --- │
│ str ┆ datetime[ns] ┆ i64 │
╞═════╪═════════════════════╪═══════╡
│ 123 ┆ 2022-01-01 00:00:00 ┆ 10 │
│ 123 ┆ 2022-01-02 00:00:00 ┆ 10 │
│ 123 ┆ 2022-01-03 00:00:00 ┆ 10 │
│ 123 ┆ 2022-01-04 00:00:00 ┆ 10 │
│ abc ┆ 2022-03-04 00:00:00 ┆ 3 │
│ 456 ┆ 2022-05-11 00:00:00 ┆ 4 │
│ 456 ┆ 2022-05-12 00:00:00 ┆ 4 │
│ 456 ┆ 2022-05-13 00:00:00 ┆ 4 │
│ 456 ┆ 2022-05-14 00:00:00 ┆ 4 │
│ 456 ┆ 2022-05-15 00:00:00 ┆ 4 │
│ 456 ┆ 2022-05-16 00:00:00 ┆ 4 │
└─────┴─────────────────────┴───────┘
I struggled today with the same problem and I thought I share my solution.
As cbilot already mentions pl.date_range doesn't take expressions as low and high value. So I worked around by using apply.
Data:
import polars as pl
from datetime import date
df = pl.DataFrame(
{
"id": ["123", "abc", "456"],
"start": [date(2022, 1, 1), date(2022, 3, 4), date(2022, 5, 11)],
"end": [date(2022, 1, 4), date(2022, 3, 4), date(2022, 5, 16)],
"value": [10, 3, 4],
}
)
Solution:
(
df.with_columns(
pl.struct("start", "end")
.map_elements(lambda x: pl.date_range(x["start"], x["end"], "1d", eager=True))
.alias("date"))
.explode("date")
.select("id", "date", "value")
)
shape: (11, 3)
┌─────┬────────────┬───────┐
│ id ┆ date ┆ value │
│ --- ┆ --- ┆ --- │
│ str ┆ date ┆ i64 │
╞═════╪════════════╪═══════╡
│ 123 ┆ 2022-01-01 ┆ 10 │
│ 123 ┆ 2022-01-02 ┆ 10 │
│ 123 ┆ 2022-01-03 ┆ 10 │
│ 123 ┆ 2022-01-04 ┆ 10 │
│ abc ┆ 2022-03-04 ┆ 3 │
│ 456 ┆ 2022-05-11 ┆ 4 │
│ 456 ┆ 2022-05-12 ┆ 4 │
│ 456 ┆ 2022-05-13 ┆ 4 │
│ 456 ┆ 2022-05-14 ┆ 4 │
│ 456 ┆ 2022-05-15 ┆ 4 │
│ 456 ┆ 2022-05-16 ┆ 4 │
└─────┴────────────┴───────┘