[SOLVED] Pandas groupby make all elements 0 if first element is 1

I have the following df:

| day      | first mover    |
| -------- | -------------- |
| 1        |     1        |
| 2        |     1        |
| 3        |     0        |
| 4        |     0        |
| 5        |     0        |
| 6        |     1        |
| 7        |     0        |
| 8        |     1        |

I want to group this DataFrame from bottom to top in chunks of 4 rows. Furthermore, if the first row of a group is 1, all other entries in that group should be set to 0. Desired output:

| day      | first mover    |
| -------- | -------------- |
| 1        |     1        |
| 2        |     0        |
| 3        |     0        |
| 4        |     0        |
| 5        |     0        |
| 6        |     0        |
| 7        |     0        |
| 8        |     0        |

I have accomplished the first half (the grouping). I am confused about how to set the other entries to 0 if the first entry in a group is 1.

N = 4
df.iloc[::-1].groupby(np.arange(len(df)) // N)

Solution

I would use a for loop for this:

for name, group in df.groupby(...):

This way I can use if/else to run or skip code for each group.

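To get the first element of a group:
(.first() doesn't work here as I expected. As far as I can tell, each group is a plain DataFrame, so calling .first() on it is the time-series method DataFrame.first(offset), which asks for an offset, not GroupBy.first().)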

first_value = group.iloc[0]['first mover']
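For comparison, here is a small sketch of `.first()` called on the grouped object rather than on a single group. This is not part of the loop approach; the `labels` and `first_values` names are only for illustration, and the sample data is copied from the question. Note that `GroupBy.first()` returns the first non-null value per group, which matches the first row here because the column has no missing values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'day': [1, 2, 3, 4, 5, 6, 7, 8],
    'first mover': [1, 1, 0, 0, 0, 1, 0, 1],
})

N = 4
# Group label per row: rows 0-3 -> 0, rows 4-7 -> 1 (illustrative name)
labels = np.arange(len(df)) // N

# One value per group: the first non-null 'first mover' in that group
first_values = df.groupby(labels)['first mover'].first()
print(first_values)
```

This gives one value per group, which is handy for inspection, but the loop still needs the row labels of each group to write the zeros back.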

To get the index labels of the remaining rows (all except the first):

group.index[1:]

and use them to set those rows to 0 in the original df:

df.loc[group.index[1:], 'first mover'] = 0
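The same rule can probably also be written without an explicit loop. The following is only a sketch of that alternative, not the approach used in this answer: it assumes top-to-bottom groups of N rows, and the names `labels`, `first_in_group`, `position`, `mask` and `result` are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'day': [1, 2, 3, 4, 5, 6, 7, 8],
    'first mover': [1, 1, 0, 0, 0, 1, 0, 1],
})

N = 4
# Group label per row: rows 0-3 -> 0, rows 4-7 -> 1
labels = np.arange(len(df)) // N
grouped = df.groupby(labels)['first mover']

# First value of each group, repeated for every row of that group
first_in_group = grouped.transform('first')
# Position of each row inside its group (0 for the first row)
position = grouped.cumcount()

# Rows to zero: not the first row, and the group's first value is 1
mask = (first_in_group == 1) & (position > 0)

result = df.copy()  # leave the original df untouched
result.loc[mask, 'first mover'] = 0
print(result)
```

The trade-off is readability: the loop makes the if/else per group explicit, while the mask version avoids iterating over groups in Python.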

Minimal working code that I used for testing:

import pandas as pd

df = pd.DataFrame({
    'day': [1, 2, 3, 4, 5, 6, 7, 8],
    'first mover': [1, 1, 0, 0, 0, 1, 0, 1],
})

N = 4

# Group rows by position: index 0-3 -> group 0, index 4-7 -> group 1
# (relies on the default RangeIndex)
for name, group in df.groupby(by=lambda index: index // N):
    #print(f'\n---- group {name} ---\n')
    #print(group)

    first_value = group.iloc[0]['first mover']
    #print('first value:', first_value)

    if first_value == 1:
        # Zero every row of this group except the first one
        #print('>>> change:', group.index[1:])
        df.loc[group.index[1:], 'first mover'] = 0

print('\n--- df ---\n')
print(df)
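
The code above groups from top to bottom. If the groups should really be built from bottom to top, as in the question's `df.iloc[::-1]` attempt, the same loop might be adapted as in the sketch below. It assumes the "first" row of each group means the bottom-most row of that chunk; `reversed_df`, `labels` and `result` are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'day': [1, 2, 3, 4, 5, 6, 7, 8],
    'first mover': [1, 1, 0, 0, 0, 1, 0, 1],
})

N = 4
result = df.copy()  # write changes here, keep df as the input

# Reverse the rows; the original index labels (7, 6, ..., 0) are kept
reversed_df = df.iloc[::-1]
# Positional group labels: the last 4 rows of df -> 0, the first 4 -> 1
labels = np.arange(len(reversed_df)) // N

for name, group in reversed_df.groupby(labels):
    # group.iloc[0] is the bottom-most row of this chunk in the original df
    if group['first mover'].iloc[0] == 1:
        # Zero the other rows of the chunk, addressed by original labels
        result.loc[group.index[1:], 'first mover'] = 0

print(result)
```

Because the index labels survive the reversal, `result.loc[...]` still addresses the right rows. With the sample data, only day 6 changes under this reading.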