Given the following pandas DataFrame:
import pandas as pd
df = pd.DataFrame({'a': [None,1,None,None,2,None], 'b': [None,None,None,3,None,4]})
     a    b
0  NaN  NaN
1  1.0  NaN
2  NaN  NaN
3  NaN  3.0
4  2.0  NaN
5  NaN  4.0
I want to return the position and value of the next non-blank b after a non-blank a, and put this position and value into new columns next to the non-blank a, so:
     a    b  position  value
0  NaN  NaN       NaN    NaN
1  1.0  NaN         3    3.0
2  NaN  NaN       NaN    NaN
3  NaN  3.0       NaN    NaN
4  2.0  NaN         5    4.0
5  NaN  4.0       NaN    NaN
If it makes any difference, the index is a Date_Time value, and the position result should be the Date_Time index of the matching b (as in the expected output above).
There will never be an a and a b on the same row: they are the minimum and maximum of a tank level over time.
The idea is to use bfill on column b, and also on the index to get the position:
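import pandas as pd
df = pd.DataFrame({'a': [None,1,None,None,2,None], 'b': [None,None,None,3,None,4]})
# Expose the index as a regular column named 'position'
df = df.reset_index().rename(columns = {'index':'position'})
# Keep the position only on rows where b is present
df.loc[df['b'].isna(),'position'] = None
# On rows where a is present, take the position of the next non-blank b
df.loc[df['a'].notna(),'position'] = df['position'].bfill()
# Blank the position on all rows without an a
df.loc[df['a'].isna(),'position'] = None
# Same backward fill for the value of b
df.loc[df['a'].notna(),'value'] = df['b'].bfill()
df[['a','b', 'position', 'value']]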
Output:
      a    b    position    value
--  ---  ---  ----------  -------
 0  nan  nan         nan      nan
 1    1  nan           3        3
 2  nan  nan         nan      nan
 3  nan    3         nan      nan
 4    2  nan           5        4
 5  nan    4         nan      nan
A more elegant, though perhaps slightly less readable, variant uses the same bfill idea but with where:
df = pd.DataFrame({'a': [None,1,None,None,2,None], 'b': [None,None,None,3,None,4]})
df['position'] = df.index.where(df['b'].notna())
df['position'] = df['position'].bfill().where(df['a'].notna())
df['value'] = df['b'].bfill().where(df['a'].notna())
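Since the question mentions a Date_Time index, here is a minimal sketch of the same bfill/where idea on a DatetimeIndex. The daily timestamps are made up for illustration, and it uses df.index.to_series() in place of df.index.where(...) so that the masks align by label; treat it as one possible adaptation rather than the only way to do it.

```python
import pandas as pd

# Hypothetical daily timestamps standing in for the real Date_Time index
idx = pd.date_range("2024-01-01", periods=6, freq="D")
df = pd.DataFrame(
    {"a": [None, 1, None, None, 2, None], "b": [None, None, None, 3, None, 4]},
    index=idx,
)

has_a = df["a"].notna()
has_b = df["b"].notna()

# Timestamp of each row that holds a b, NaT everywhere else
b_time = df.index.to_series().where(has_b)

# Backward-fill so every row sees the next b, then keep only rows with an a
df["position"] = b_time.bfill().where(has_a)
df["value"] = df["b"].bfill().where(has_a)

print(df)
```

This relies on a and b never sharing a row, as stated in the question. If they could, bfill would pick up the b on the same row as the a instead of the next one.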