Assume there are two columns name date
and maturity_date
in a Polars DataFrame, they are both the type pl.Date
. If use pl.col("maturity_date") - pl.col("date")
we could get the diff days.
But how to get the numdays of the workdays between date
and maturity_date
(not count saturday and sunday)?
Or even calculate diff days that use a given calendar like a trade date calendar of stock market which is different from a normal calendar.
Use pl.concat_list([pl.col("date"), pl.col("maturity_date")]).apply(func)
to call a udf python func could work, but this would be slow.
[update]
data = pl.DataFrame( [ pl.Series("date", [datetime.date(2000, 1, 4), datetime.date(2000, 1, 4), datetime.date(2000, 1, 4), datetime.date(2000, 1, 4), datetime.date(2000, 1, 4)], dtype=pl.Date), pl.Series("maturity_date", [datetime.date(2000, 1, 17), datetime.date(2000, 2, 15), datetime.date(2000, 3, 15), datetime.date(2000, 4, 17), datetime.date(2000, 5, 15)], dtype=pl.Date), ] )
data.with_columns(
pl.date_ranges(e.date, e.maturity_date, closed="both", eager=False)
.list.eval(pl.element().dt.weekday() <= 5)
.list.count_match(True)
.alias("duration")
# pl.concat_list(["date", "maturity_date"]).apply(lambda x: get_work_days(x[0], x[1]))
# .alias("duration")
)
I use this date_ranges to count, but it seems only a little faster than apply
@Dean MacGregor's excellent answer using pure Polars wasn't quite performant enough for my use case.
Using Numpy's built in busday_count
function turned out to be much faster in my case and can easily be converted back to a polars with pl.from_numpy
.
nyse_holidays=pl.Series([
date(2000,1,17),
date(2000,2,21),
date(2000,4,21),
date(2000,5,29),
date(2000,7,4),
date(2000,9,4),
date(2000,11,23),
date(2000,12,25),
])
df = pl.DataFrame(
{
"business_days": pl.from_numpy(np.busday_count(data["date"], data["maturity_date"]),["business_days"])["business_days"],
"trade_days": pl.from_numpy(np.busday_count(data["date"], data["maturity_date"], holidays=nyse_holidays),["trade_days"])["trade_days"],
}
)
shape: (5, 2)
┌───────────────┬────────────┐
│ business_days ┆ trade_days │
│ --- ┆ --- │
│ i64 ┆ i64 │
╞═══════════════╪════════════╡
│ 9 ┆ 9 │
│ 30 ┆ 29 │
│ 51 ┆ 49 │
│ 74 ┆ 72 │
│ 94 ┆ 91 │
└───────────────┴────────────┘