According to the manual, pd.to_datetime() should create a datetime object. Instead, when I call pd.to_datetime("2012-05-14"), I get a timestamp object! Calling to_datetime() on that object finally gives me a datetime object.
In [1]: pd.to_datetime("2012-05-14")
Out[1]: Timestamp('2012-05-14 00:00:00', tz=None)
In [2]: t = pd.to_datetime("2012-05-14")
In [3]: t.to_datetime()
Out[3]: datetime.datetime(2012, 5, 14, 0, 0)
Is there an explanation for this unexpected behaviour?
A Timestamp object is how pandas represents datetimes, so within pandas it is the datetime object. You, however, expected a datetime.datetime object.
Normally you should not need to care about this (it is mostly a matter of a different repr). As long as you are working with pandas, the Timestamp is fine. Even if you really need a datetime.datetime, most things will work (e.g. all of its methods are available), and otherwise you can call to_pydatetime() to retrieve the datetime.datetime object.
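As a minimal sketch of the point above (the variable names are only illustrative), the following checks that a Timestamp passes as a datetime.datetime and converts it explicitly with to_pydatetime(). Depending on your pandas version, the Timestamp.to_datetime() call from the question may no longer be available, so to_pydatetime() is the safer choice.

```python
import datetime as dt

import pandas as pd

ts = pd.to_datetime("2012-05-14")

# Timestamp is a subclass of datetime.datetime, so isinstance checks pass
# and the usual datetime attributes and methods are available.
is_datetime = isinstance(ts, dt.datetime)
year, month, day = ts.year, ts.month, ts.day
formatted = ts.strftime("%Y-%m-%d")

# Explicit conversion when the exact type datetime.datetime is required.
plain = ts.to_pydatetime()

print(type(ts).__name__, is_datetime)
print(type(plain) is dt.datetime, plain)
```

The explicit conversion only matters for code that checks the exact type (type(x) is datetime.datetime) rather than using isinstance.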
The longer story:
pandas stores datetimes in indexes and columns as data of type datetime64 (these are not datetime.datetime objects). This is the standard numpy type for datetimes and is more performant than using datetime.datetime objects:
In [15]: df = pd.DataFrame({'A':[dt.datetime(2012,1,1), dt.datetime(2012,1,2)]})
In [16]: df.dtypes
Out[16]:
A datetime64[ns]
dtype: object
In [17]: df.loc[0,'A']
Out[17]: Timestamp('2012-01-01 00:00:00', tz=None)
Accessing a single value, as in In [17], returns a Timestamp object. This is a more convenient object for working with datetimes (more methods, a better representation, etc. than datetime64). It is also a subclass of datetime.datetime, and so has all of its methods.
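To make the distinction between storage and scalar access concrete, here is a small sketch (the variable names are hypothetical) that compares what the column holds with what you get back for a single element. It assumes a pandas version that provides Series.to_numpy().

```python
import datetime as dt

import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [dt.datetime(2012, 1, 1), dt.datetime(2012, 1, 2)]})

# The column is stored as a numpy datetime64 array, not as Python objects.
stored_as_datetime64 = pd.api.types.is_datetime64_any_dtype(df["A"])

# Pulling one element out of the underlying array gives a raw numpy scalar.
raw_value = df["A"].to_numpy()[0]

# Pulling one element out through pandas gives a Timestamp instead.
value = df.loc[0, "A"]

print(df["A"].dtype, stored_as_datetime64)
print(type(raw_value).__name__, type(value).__name__)
```

The raw numpy.datetime64 scalar has no datetime methods such as strftime, which is one reason pandas wraps scalar values in Timestamp.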