I'm migrating code from pandas to polars. I have time-series data consisting of a timestamp column and a value column, and I need to compute a number of features. For example:
from datetime import datetime, timedelta

import numpy as np
import polars as pl

# One year of 15-minute timestamps in a time zone that observes DST
df = pl.DataFrame({
    "timestamp": pl.datetime_range(
        datetime(2017, 1, 1),
        datetime(2018, 1, 1),
        timedelta(minutes=15),
        time_zone="Australia/Sydney",
        time_unit="ms",
        eager=True,
    ),
})
value = np.random.normal(0, 1, len(df))
df = df.with_columns(pl.Series(value).alias("value"))
I need to generate a column indicating whether each timestamp falls in standard or daylight time. I'm currently using map_elements because, as far as I can see, there isn't a temporal expression for this. My current code is:
def dst(timestamp: datetime) -> int:
    # 1 if a DST offset applies at this timestamp, 0 otherwise
    return int(timestamp.dst().total_seconds() != 0)

df = df.with_columns(pl.struct("timestamp").map_elements(lambda x: dst(**x)).alias("dst"))
(This uses a trick that effectively checks whether the tzinfo.dst(dt) offset is zero or not.)
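To make the trick concrete, here is a small standard-library sketch with no polars involved. The two sample timestamps and the names sydney, summer and winter are illustrative only, and it assumes the system has time zone data available for zoneinfo.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def dst(timestamp: datetime) -> int:
    """Return 1 if a non-zero DST offset applies at this timestamp, else 0."""
    # datetime.dst() delegates to tzinfo.dst(dt); it returns None for naive
    # datetimes, so this helper assumes a timezone-aware input.
    return int(timestamp.dst().total_seconds() != 0)


sydney = ZoneInfo("Australia/Sydney")
summer = datetime(2017, 1, 1, 12, tzinfo=sydney)  # January: daylight time in Sydney
winter = datetime(2017, 7, 1, 12, tzinfo=sydney)  # July: standard time in Sydney

flags = [dst(summer), dst(winter)]
print(flags)
```

The first timestamp falls in the southern-hemisphere summer, so its DST offset is one hour; the second has a zero offset.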
Is there a (fast) way of doing this using polars expressions rather than (slow) map_elements?
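With polars>=0.18.5, dt.dst_offset() does this directly. The following sets dst to 1 during daylight time and 0 during standard time, matching the dst function in the question:
df = df.with_columns((pl.col("timestamp").dt.dst_offset() != 0).cast(pl.Int32).alias("dst"))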