I have an array that represents object states, where 0 means the object is off and 1 means the object is on.
import pandas as pd
import numpy as np
s = [np.nan, 0, np.nan, np.nan, 1, np.nan, np.nan, 0, np.nan, 1, np.nan]
df = pd.DataFrame(s, columns=["s"])
df
s
0 NaN
1 0.0
2 NaN
3 NaN
4 1.0
5 NaN
6 NaN
7 0.0
8 NaN
9 1.0
10 NaN
I need to forward-fill only the 0 values in it, like below.
>>> df_wanted
s
0 NaN
1 0.0
2 0.0
3 0.0
4 1.0
5 NaN
6 NaN
7 0.0
8 0.0
9 1.0
10 NaN
After browsing similar questions here, I compare the ffill-ed and bfill-ed values and assign back with a mask:
mask = (df.ffill() == 0) & (df.bfill() == 1)
df[mask] = 0
df
s
0 NaN
1 0.0
2 0.0
3 0.0
4 1.0
5 NaN
6 NaN
7 0.0
8 0.0
9 1.0
10 NaN
But this doesn't help when a 0 value is not followed by a 1. What would be a more elegant solution that takes such cases into account?
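To make the failing case concrete, here is a minimal sketch. It assumes the same data with a trailing 0 and two NaNs appended; the name df_tail is only illustrative. Because bfill has nothing to pull back into the trailing NaNs, the combined mask is False there and they are left unfilled.

```python
import numpy as np
import pandas as pd

# Same data as above, plus a trailing 0 that is never followed by a 1
# (the extra values are an assumption made for illustration).
s = [np.nan, 0, np.nan, np.nan, 1, np.nan, np.nan, 0, np.nan, 1,
     np.nan, 0, np.nan, np.nan]
df_tail = pd.DataFrame(s, columns=["s"])

# bfill leaves the last two rows as NaN, so the second condition is False there.
mask = (df_tail.ffill() == 0) & (df_tail.bfill() == 1)
df_tail[mask] = 0

# Rows 12 and 13 follow a 0 but stay NaN.
tail_values = df_tail["s"].tolist()
print(tail_values)
```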
mask = (df.ffill() == 0) alone should be enough for your use case.
First, df.ffill propagates the last valid observation forward, so NaN rows that come after a 0 are filled with 0s, and NaN rows that come after a 1 are filled with 1s. Comparing the result to 0 selects only the rows whose last valid value is 0; use that as the mask to get your final df.
Example (I added a 0 and a few NaNs to the end of your df):
>>> s = [np.nan, 0, np.nan, np.nan, 1, np.nan, np.nan, 0, np.nan, 1, np.nan, np.nan, 0, np.nan, np.nan, np.nan]
>>> df = pd.DataFrame(s, columns=["s"])
>>> df
s
0 NaN
1 0.0
2 NaN
3 NaN
4 1.0
5 NaN
6 NaN
7 0.0
8 NaN
9 1.0
10 NaN
11 NaN
12 0.0
13 NaN
14 NaN
15 NaN
>>>
>>>
>>> df[df.ffill() == 0] = 0
>>> df
s
0 NaN
1 0.0
2 0.0
3 0.0
4 1.0
5 NaN
6 NaN
7 0.0
8 0.0
9 1.0
10 NaN
11 NaN
12 0.0
13 0.0
14 0.0
15 0.0
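The assignment above modifies df in place. If you would rather keep the original frame untouched, one possible variant is sketched below using Series.mask, which returns a new Series. The names zero_run and filled are only illustrative.

```python
import numpy as np
import pandas as pd

s = [np.nan, 0, np.nan, np.nan, 1, np.nan, np.nan, 0, np.nan, 1,
     np.nan, np.nan, 0, np.nan, np.nan, np.nan]
df = pd.DataFrame(s, columns=["s"])

# True wherever the last valid observation at or before the row is 0.
zero_run = df["s"].ffill().eq(0)

# Series.mask replaces values where the condition is True and returns
# a new Series, so df itself is left unchanged.
filled = df["s"].mask(zero_run, 0)
print(filled)
```

Here filled holds the forward-filled zeros, while df["s"] still contains the original NaNs.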