I have some duration-type data (lap times) as pl.String that fails to convert using strptime, whereas regular datetimes work as expected.
Minutes (before the :) and seconds (before the .) are always padded to two digits; milliseconds are always three digits.
Lap times are always < 2 min.
import polars as pl

df = pl.DataFrame({
    "lap_time": ["01:14.007", "00:53.040", "01:00.123"]
})
df = df.with_columns(
    # pl.col('release_date').str.to_date("%B %d, %Y"),  # works
    pl.col('lap_time').str.to_time("%M:%S.%3f").cast(pl.Duration),  # fails
)
So I used the chrono format specifier definitions from https://docs.rs/chrono/latest/chrono/format/strftime/index.html, which polars uses according to its docs for strptime.
The second conversion (for lap_time) always fails, no matter whether I use .%f, .%3f, or %.3f. Apparently, strptime doesn't allow creating a pl.Duration directly, so I tried pl.Time instead, but it fails with the error:
ComputeError: strict conversion to dates failed, maybe set strict=False
but setting strict=False yields all null values for the whole Series.
Am I missing something, or is this some weird behavior on chrono's or python-polars' part?
If your durations may exceed 24 hours, you can extract the components (minutes, seconds, and so on) from the string using a regex pattern. For example:
df = pl.DataFrame({
    "time": ["+01:14.007", "100:20.000", "-05:00.000"]
})
df.with_columns(
    pl.col("time").str.extract_all(r"([+-]?\d+)")
    #                               /
    #           you will get an array of length 3
    #           ["min", "sec", "ms"]
).with_columns(
    pl.duration(
        minutes=pl.col("time").list.get(0),
        seconds=pl.col("time").list.get(1),
        milliseconds=pl.col("time").list.get(2)
    ).alias("time")
)
┌──────────────┐
│ time         │
│ ---          │
│ duration[ns] │
╞══════════════╡
│ 1m 14s 7ms   │
│ 1h 40m 20s   │
│ -5m          │
└──────────────┘
pl.Time
To convert data to pl.Time, you need to specify hours as well. If you prepend 00 hours to your time string, the code works:
df = pl.DataFrame({"str_time": ["01:14.007", "01:18.880"]})
df.with_columns(
    duration = (pl.lit("00:") + pl.col("str_time"))
    .str.to_time("%T%.3f")
    .cast(pl.Duration)
)
┌───────────┬──────────────┐
│ str_time  ┆ duration     │
│ ---       ┆ ---          │
│ str       ┆ duration[μs] │
╞═══════════╪══════════════╡
│ 01:14.007 ┆ 1m 14s 7ms   │
│ 01:18.880 ┆ 1m 18s 880ms │
└───────────┴──────────────┘