pythondatetimecastingpython-polarsepoch

Cast multiple columns with Unix epoch to Datetime


I have a dataframe with multiple columns containing unix epochs. In my example I only use 2 of 13 columns I have. I'd like to cast all those columns to a datetime with UTC timezone in a single call to with_columns().

Polars provides the user with pl.from_epoch. However, I didn't find a way to apply it to multiple columns as once.

df = pl.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "start_date": [1566637530, 1561372720, 1561374780, 1558714718, 1558715044],
    "end_date": [1566628686, 1561358079, 1561358135, 1556188225, 1558427697],
    "cancelable": [True, True, False, False, True],
})
id start_date end_date cancelable
i64 i64 i64 bool
----- ------------ ------------ ------------
1 1566637530 1566628686 true
2 1561372720 1561358079 true
3 1561374780 1561358135 false
4 1558714718 1556188225 false
5 1558715044 1558427697 true

but I need:

id start_date end_date cancelable
i64 datetime[s, UTC] datetime[s, UTC] bool
----- ------------------------ ------------------------ ------------
1 2019-08-24T09:05:30+00 2019-08-24T06:38:06+00 true
2 2019-06-24T10:38:40+00 2019-06-24T06:34:39+00 true
3 2019-06-24T11:13:00+00 2019-06-24T06:35:35+00 false
4 2019-05-24T18:18:38+00 2019-04-25T10:30:25+00 false
5 2019-05-24T16:24:04+00 2019-05-21T08:34:57+00 true

So far, my code looks as follows.

columns_epoch_to_timestamp: list[str] = [
    "start_date",
    "end_date",
]

df = df.with_columns(pl.col(*columns_epoch_to_timestamp). )

Solution

  • pl.from_epoch() accepts a single column name, but also Expr objects.

    column: str | Expr | Series | Sequence[int]

    pl.col() can select multiple columns, which we can pass directly:

    df.with_columns(pl.from_epoch(pl.col("start_date", "end_date")))
    
    shape: (5, 4)
    ┌─────┬─────────────────────┬─────────────────────┬────────────┐
    │ id  ┆ start_date          ┆ end_date            ┆ cancelable │
    │ --- ┆ ---                 ┆ ---                 ┆ ---        │
    │ i64 ┆ datetime[μs]        ┆ datetime[μs]        ┆ bool       │
    ╞═════╪═════════════════════╪═════════════════════╪════════════╡
    │ 1   ┆ 2019-08-24 09:05:30 ┆ 2019-08-24 06:38:06 ┆ true       │
    │ 2   ┆ 2019-06-24 10:38:40 ┆ 2019-06-24 06:34:39 ┆ true       │
    │ 3   ┆ 2019-06-24 11:13:00 ┆ 2019-06-24 06:35:35 ┆ false      │
    │ 4   ┆ 2019-05-24 16:18:38 ┆ 2019-04-25 10:30:25 ┆ false      │
    │ 5   ┆ 2019-05-24 16:24:04 ┆ 2019-05-21 08:34:57 ┆ true       │
    └─────┴─────────────────────┴─────────────────────┴────────────┘