Suppose a DataFrame is constructed as follows using polars-python:
import polars as pl
from polars import col
from datetime import datetime
df = pl.DataFrame({
    "dates": ["2016-07-02", "2016-08-10", "2016-08-31", "2016-09-10"],
    "values": [1, 2, 3, 4]
})
How can I select the rows within a certain date range, e.g. between "2016-08-10" and "2016-08-31" (inclusive), so that the desired outcome is:
┌────────────┬────────┐
│ dates      ┆ values │
│ ---        ┆ ---    │
│ date       ┆ i64    │
╞════════════╪════════╡
│ 2016-08-10 ┆ 2      │
│ 2016-08-31 ┆ 3      │
└────────────┴────────┘
First you need to transform the string values to date types, then filter:
# eager
(df.with_columns(pl.col("dates").str.to_date())
   .filter(col("dates").is_between(datetime(2016, 8, 9), datetime(2016, 9, 1)))
)
# lazy
(df.lazy()
   .with_columns(pl.col("dates").str.to_date())
   .filter(col("dates").is_between(datetime(2016, 8, 9), datetime(2016, 9, 1)))
   .collect()
)
Both produce the desired output:
┌────────────┬────────┐
│ dates      ┆ values │
│ ---        ┆ ---    │
│ date       ┆ i64    │
╞════════════╪════════╡
│ 2016-08-10 ┆ 2      │
│ 2016-08-31 ┆ 3      │
└────────────┴────────┘