dataframepython-polars

How to create additional columns weekofyear,month and dayofweek in polars?


I am trying to create additional columns from an existing column name in polars. The existing column name is starttime. This column contains a datetime string.

df = pl.from_repr("""
┌─────────────────────┬─────────────────────┬────────────┬───────────┬─────────┐
│ starttime           ┆ endtime             ┆ storageid  ┆ volume_id ┆ avgiops │
│ ---                 ┆ ---                 ┆ ---        ┆ ---       ┆ ---     │
│ str                 ┆ str                 ┆ str        ┆ str       ┆ i64     │
╞═════════════════════╪═════════════════════╪════════════╪═══════════╪═════════╡
│ 2022-02-10 09:32:20 ┆ 2022-02-10 09:34:28 ┆ TUNYKYPG72 ┆ 4c8d6c31  ┆ 27      │
│ 2022-02-10 10:34:10 ┆ 2022-02-10 10:35:12 ┆ TUNYKYPG42 ┆ 4c8d6d31  ┆ 34      │
└─────────────────────┴─────────────────────┴────────────┴───────────┴─────────┘
""")

From this table , I would like to create additional columns like weekofyear,dayofweek,year etc.

df.with_columns(
    pl.col('starttime').str.to_datetime('%Y').alias('year')
)

But it fails as

InvalidOperationError: conversion from `str` to `date` failed in column 'starttime'

How to create additional columns weekofyear, dayofweek, month in polars?


Solution

  • The first step is to turn the strings into Datetime objects, i.e. .str.to_datetime()

    You can use the methods in the Temporal namespace to extract the pieces of information you want.

    df.with_columns(
       pl.col("starttime").str.to_datetime()
    ).with_columns(
       pl.col("starttime").dt.week().alias("week"),
       pl.col("starttime").dt.weekday().alias("weekday"),
       pl.col("starttime").dt.hour().alias("hour"),
       pl.col("starttime").dt.day().alias("day"),
       pl.col("starttime").dt.iso_year().alias("year")
    )
    
    shape: (2, 10)
    ┌─────────────────────┬─────────────────────┬────────────┬───────────┬─────────┬──────┬─────────┬──────┬─────┬──────┐
    │ starttime           ┆ endtime             ┆ storageid  ┆ volume_id ┆ avgiops ┆ week ┆ weekday ┆ hour ┆ day ┆ year │
    │ ---                 ┆ ---                 ┆ ---        ┆ ---       ┆ ---     ┆ ---  ┆ ---     ┆ ---  ┆ --- ┆ ---  │
    │ datetime[μs]        ┆ str                 ┆ str        ┆ str       ┆ i64     ┆ i8   ┆ i8      ┆ i8   ┆ i8  ┆ i32  │
    ╞═════════════════════╪═════════════════════╪════════════╪═══════════╪═════════╪══════╪═════════╪══════╪═════╪══════╡
    │ 2022-02-10 09:32:20 ┆ 2022-02-10 09:34:28 ┆ TUNYKYPG72 ┆ 4c8d6c31  ┆ 27      ┆ 6    ┆ 4       ┆ 9    ┆ 10  ┆ 2022 │
    │ 2022-02-10 10:34:10 ┆ 2022-02-10 10:35:12 ┆ TUNYKYPG42 ┆ 4c8d6d31  ┆ 34      ┆ 6    ┆ 4       ┆ 10   ┆ 10  ┆ 2022 │
    └─────────────────────┴─────────────────────┴────────────┴───────────┴─────────┴──────┴─────────┴──────┴─────┴──────┘
    ``