I want to test that two Polars DataFrame objects, each containing a column of dates, are equivalent.
If I use datetime.date from the standard library, I don't have any problems:
import datetime as dt
import polars as pl
from polars.testing import assert_frame_equal
assert_frame_equal(pl.DataFrame({"foo": [1], "bar": [dt.date(2000, 1, 1)]}), pl.DataFrame({"foo": [1], "bar": [dt.date(2000, 1, 1)]}))
But if I try to use the Date type from polars, the comparison fails with a PanicException: not implemented.
assert_frame_equal(pl.DataFrame({"foo": [1], "bar": [pl.Date(2000, 1, 1)]}), pl.DataFrame({"foo": [1], "bar": [pl.Date(2000, 1, 1)]}))
Is there a way to use the polars Date type in the DataFrame and still be able to compare the two objects?
I don't think you're supposed to use pl.Date like that. pl.Date is a data type rather than a constructor for date values, so the bar column of your DataFrame ends up with dtype object, which is probably not what you wanted:
In [2]: pl.DataFrame({"foo": [1], "bar": [pl.Date(2000, 1, 1)]})
Out[2]:
shape: (1, 2)
┌─────┬─────────────────────────────────────┐
│ foo ┆ bar                                 │
│ --- ┆ ---                                 │
│ i64 ┆ object                              │
╞═════╪═════════════════════════════════════╡
│ 1   ┆ <polars.datatypes.Date object at... │
└─────┴─────────────────────────────────────┘
Instead, do:
df1 = pl.DataFrame({"foo": [1], "bar": ['2000-01-01']}).with_columns(pl.col('bar').str.to_date())
df2 = pl.DataFrame({"foo": [1], "bar": ['2000-01-01']}).with_columns(pl.col('bar').str.to_date())
assert_frame_equal(df1, df2)
This works fine, because both bar columns now have the Date dtype.