
Fill missing rows based on index number gaps, why does it work? - pandas series


So say I have a pandas Series:

import pandas as pd
s = pd.Series([1, 2], index=[0, 3])

0    1
3    2
dtype: int64

There's a gap between 0 and 3 in the index, so I want to add rows that fill the gap and give an index of [0, 1, 2, 3].

So desired output would look like:

0    1.0
1    NaN
2    NaN
3    2.0
dtype: float64

And I did:

print(s.reindex(range(s.index.min(),s.index.max()+1)))

And it worked!

But why?

I expected a result of:

0    1.0
1    2.0
2    NaN
3    NaN
dtype: float64

But it doesn't, and gives the desired output instead!

(You know, I was ready to create a question about how to do this, but while preparing to show an attempt, I solved it :D, so I asked why it worked instead :-) , lol)


Solution

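  • Intrinsic data alignment. Basically, your source data is aligned with the index labels 0 and 3. reindex matches values by label, not by position: it reuses the existing rows for labels 0 and 3 and creates new rows filled with NaN for labels 1 and 2, which do not exist in the original index.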

    Watch what happens if you do:

    s.reindex([0,0,3,3])
    

    Output:

    0    1
    0    1
    3    2
    3    2
    dtype: int64
    

    Pandas uses index alignment automatically: each requested label pulls the value stored under that label.

    Or, with labels that are not in the original index:

    s.reindex([1,2,5,6])
    

    Output:

    1   NaN
    2   NaN
    5   NaN
    6   NaN
    dtype: float64
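
To make the difference concrete, here is a minimal sketch that puts the label-based result of reindex next to the positional result the question expected. The names full_index, by_label, and by_position are just illustrative, and the positional version is built by hand with NumPy only for comparison; it is not something reindex does.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2], index=[0, 3])
full_index = range(s.index.min(), s.index.max() + 1)  # labels 0, 1, 2, 3

# Label-based: every new label looks up its own value in s.
# Labels 0 and 3 exist, labels 1 and 2 do not, so they become NaN.
by_label = s.reindex(full_index)  # values: 1.0, NaN, NaN, 2.0

# Positional (what the question expected): ignore the labels and
# lay the values out in order, padding the remainder with NaN.
values = np.full(len(full_index), np.nan)
values[: len(s)] = s.to_numpy()
by_position = pd.Series(values, index=full_index)  # values: 1.0, 2.0, NaN, NaN

print(by_label)
print(by_position)
```

Note that the dtype changes to float64 only when NaN has to be inserted; in the s.reindex([0,0,3,3]) example every label exists, so the result stays int64.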