I have a dataframe like this:
  key1  day  feat
0    a    1  None
1    a    2     A
2    a    3  None
3    a    4     A
4    b    1     A
5    b    2  None
6    b    3  None
7    b    4     A
I would like to apply an expanding window with the count function over the feat column, but apply the expanding window per subcategory based on the key1 column.
E.g., I want my resulting df to be:
  key1  day  feat  count
0    a    1  None      0
1    a    2     A      1
2    a    3  None      1
3    a    4     A      2
4    b    1     A      1
5    b    2  None      1
6    b    3  None      1
7    b    4     A      2
So in this case I would group by key1 and then apply the expanding window to each subgroup, so that the count resets for each group. Note that in my actual problem there are two keys I need to group by, not just one.
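For reproducibility, the sample frame above could be built roughly like this. This is only a sketch, and it assumes the None entries are real missing values rather than the string 'None':

```python
import pandas as pd

# Sample data from the question; None here is a real missing value (assumption)
df = pd.DataFrame({
    "key1": ["a", "a", "a", "a", "b", "b", "b", "b"],
    "day": [1, 2, 3, 4, 1, 2, 3, 4],
    "feat": [None, "A", None, "A", "A", None, None, "A"],
})
print(df)
```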
Let's do groupby + cumsum:
import numpy as np
df['feat'] = df['feat'].replace('None', np.nan)  # use only if None is a string
# running count of non-missing feat values within each key1 group
df['count'] = df['feat'].notna().groupby(df['key1']).cumsum()
Result:
  key1  day  feat  count
0    a    1   NaN      0
1    a    2     A      1
2    a    3   NaN      1
3    a    4     A      2
4    b    1     A      1
5    b    2   NaN      1
6    b    3   NaN      1
7    b    4     A      2
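Since the question mentions two grouping keys, the same idea should carry over by passing a list of key Series to groupby. A minimal sketch, assuming a second key column called key2 (the name and its values are made up for illustration, they are not from the question):

```python
import pandas as pd

df = pd.DataFrame({
    "key1": ["a", "a", "a", "a", "b", "b", "b", "b"],
    "key2": ["x", "x", "y", "y", "x", "x", "y", "y"],  # hypothetical second key
    "day": [1, 2, 3, 4, 1, 2, 3, 4],
    "feat": [None, "A", None, "A", "A", None, None, "A"],
})

# Flag rows where feat is present, then take a running total that
# restarts for every distinct (key1, key2) combination.
df["count"] = df["feat"].notna().groupby([df["key1"], df["key2"]]).cumsum()
print(df)
```

With these made-up key2 values the count column comes out as 0, 1, 0, 1, 1, 1, 0, 1, because the running total now resets for each (key1, key2) pair instead of for each key1 alone.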