pandas datetime indexer

TypeError: Cannot compare dtypes int64 and datetime64[ns]


I have a datetime column in df as follows.

0    2019-04-09 15:40:00
1    2019-04-09 15:45:00
2    2019-04-09 15:50:00
3    2019-04-09 15:55:00
4    2019-04-09 16:00:00
5    2019-04-09 16:05:00
6    2019-04-09 16:10:00
7    2019-04-09 16:15:00
8    2019-04-09 16:20:00
9    2019-04-09 16:25:00
10   2019-04-09 16:30:00
11   2019-04-09 16:35:00
12   2019-04-09 16:40:00
13   2019-04-09 16:45:00
14   2019-04-09 16:50:00
15   2019-04-09 16:55:00
16   2019-04-09 17:00:00
17   2019-04-09 17:05:00
18   2019-04-09 17:10:00
19   2019-04-09 17:15:00

I tried to find the nearest datetime to a given timestamp.

result = df.index.get_indexer([pd.to_datetime('2019-04-09 17:01:00')], method='nearest')

But it gives the following error.

   match = X.index.get_indexer([dt], method='nearest')
  File "/home/mark/.local/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3740, in get_indexer
    return self._get_indexer_non_comparable(target, method=method, unique=True)
  File "/home/mark/.local/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 5990, in _get_indexer_non_comparable
    raise TypeError(f"Cannot compare dtypes {self.dtype} and {other.dtype}")
TypeError: Cannot compare dtypes int64 and datetime64[ns]

Solution

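  • As BigBen hinted, get_indexer compares its target argument with the DataFrame's index. The default index is an integer RangeIndex (dtype int64), which is why the error reports a comparison between int64 and datetime64[ns]. Therefore, provided the time column is already in datetime format, you simply forgot to set the datetime column as the index first.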

    df = df.set_index(['DT64'])
    df.head()
    Empty DataFrame
    Columns: []
    Index: [2019-04-09 15:40:00, 2019-04-09 15:45:00, 2019-04-09 15:50:00, 2019-04-09 15:55:00, 2019-04-09 16:00:00]
    

    (Yes, that is an empty DataFrame; only the index exists.)

    Now you can get the indexer (just reproducing your line here):

    ix = df.index.get_indexer([pd.to_datetime('2019-04-09 17:01:00')], method='nearest')
    ix
    array([16], dtype=int64)
    

    That correctly returns the position of the nearest datetime (position 16, i.e. 17:00:00), which you can pass to iloc:

    df.iloc[ix]
    Empty DataFrame
    Columns: []
    Index: [2019-04-09 17:00:00]
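
For reference, here is a minimal self-contained sketch of the error and one way around it. It assumes the column is named DT64 (as in the set_index call above) and rebuilds the 20 timestamps with pd.date_range. The names target, dt_index, pos and nearest are only illustrative. Instead of replacing the index, it builds a DatetimeIndex from the column, which is an alternative to set_index rather than the approach shown above.

```python
import pandas as pd

# Rebuild the 20 timestamps from the question (5-minute steps, 15:40 to 17:15).
# The column name "DT64" is assumed from the set_index call in the answer.
df = pd.DataFrame(
    {"DT64": pd.date_range("2019-04-09 15:40:00", periods=20, freq="5min")}
)

target = pd.to_datetime("2019-04-09 17:01:00")

# With the default integer RangeIndex, the lookup compares integers with a
# timestamp, so pandas raises a TypeError.
try:
    df.index.get_indexer([target], method="nearest")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Build a DatetimeIndex from the column (df itself is left unchanged) and
# look up the position of the nearest timestamp.
dt_index = pd.DatetimeIndex(df["DT64"])
pos = dt_index.get_indexer([target], method="nearest")

# pos holds integer positions, so it can be used with iloc on the original frame.
nearest = df["DT64"].iloc[pos[0]]
print(error_message)
print(pos[0], nearest)
```

This keeps the original integer index, which may be convenient if other code relies on it. If most lookups are by time, setting the datetime column as the index is probably the simpler choice.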