I am trying to get the indices of the missing dates by comparing an index against a complete list of dates, as follows:
import numpy as np
import pandas as pd
a = pd.DatetimeIndex(["2000", "2001", "2002", "2003",
"2004", "2005", "2009", "2010"])
b = pd.DatetimeIndex(["2000", "2001", "2002", "2003",
"2004", "2005", "2006", "2007",
"2008", "2009", "2010"])
a.reindex(b)
I got the following:
(DatetimeIndex(['2000-01-01', '2001-01-01', '2002-01-01', '2003-01-01',
'2004-01-01', '2005-01-01', '2006-01-01', '2007-01-01',
'2008-01-01', '2009-01-01', '2010-01-01'],
dtype='datetime64[ns]', freq=None),
array([ 0, 1, 2, 3, 4, 5, -1, -1, -1, 6, 7]))
I tried to replace every missing value (the -1 entries) with NaN by using a.reindex(b, fill_value=np.NAN)
but I got the following error: TypeError: Index.reindex() got an unexpected keyword argument 'fill_value'
According to the pandas documentation, fill_value is among the parameters of reindex. Any ideas?
First of all, you have to do:
newIndex, indexer = a.reindex(b)
reindex returns two things: the new index and an indexer. You only need the indexer.
So now you can get what you want:
indexerWithNan = np.where(indexer == -1, np.nan, indexer)
Which is:
[ 0. 1. 2. 3. 4. 5. nan nan nan 6. 7.]
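Putting the steps above together, here is a minimal, self-contained sketch. The names new_index, indexer_with_nan, missing_positions and missing_dates are only illustrative, and the last two lines are an optional extra, assuming what you ultimately want is where the gaps are:

```python
import numpy as np
import pandas as pd

a = pd.DatetimeIndex(["2000", "2001", "2002", "2003",
                      "2004", "2005", "2009", "2010"])
b = pd.DatetimeIndex(["2000", "2001", "2002", "2003",
                      "2004", "2005", "2006", "2007",
                      "2008", "2009", "2010"])

# Index.reindex returns a (new_index, indexer) tuple
new_index, indexer = a.reindex(b)

# -1 marks entries of b that are absent from a.
# The result is a float array because NaN is a float.
indexer_with_nan = np.where(indexer == -1, np.nan, indexer)

# Optional: positions in b, and the dates themselves, that a lacks
missing_positions = np.flatnonzero(indexer == -1)
missing_dates = b[indexer == -1]
```

Note that the integer positions become floats once NaN is introduced, so if you need to index with them later, filtering on indexer == -1 directly may be more convenient.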
Why was your initial code wrong? The reindex() method does not support the fill_value parameter for pandas.Index objects as it does for pandas.Series or pandas.DataFrame.
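If you would rather use fill_value, one possible workaround (not the only one) is to wrap the positions of a in a Series, since Series.reindex does accept that parameter. This is a sketch; the names positions, aligned and aligned_filled are made up for illustration:

```python
import numpy as np
import pandas as pd

a = pd.DatetimeIndex(["2000", "2001", "2002", "2003",
                      "2004", "2005", "2009", "2010"])
b = pd.DatetimeIndex(["2000", "2001", "2002", "2003",
                      "2004", "2005", "2006", "2007",
                      "2008", "2009", "2010"])

# Series whose values are the integer positions within a
positions = pd.Series(np.arange(len(a)), index=a)

# Series.reindex fills labels missing from a with NaN by default
aligned = positions.reindex(b)

# fill_value is accepted here, unlike on Index.reindex
aligned_filled = positions.reindex(b, fill_value=-1)
```

Here aligned already holds NaN for 2006 to 2008 without any extra step, while aligned_filled keeps an integer dtype by using -1 as the placeholder.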