I have the dataframe:
df =
batch  Code
a      100
a      120
a      130
a      120
b      140
b      150
c      100
I want to add a column 'next_code' that holds the value of the 'Code' column from the next row within the same batch, with 'END' for the last row of each batch. The expected output is:
df =
batch  Code  next_code
a      100   120
a      120   130
a      130   120
a      120   END
b      140   150
b      150   END
c      100   END
What is the best way to do it?
Use DataFrameGroupBy.shift with the fill_value parameter:
df['next_code'] = df.groupby('batch')['Code'].shift(-1, fill_value='END')
print(df)
  batch  Code next_code
0     a   100       120
1     a   120       130
2     a   130       120
3     a   120       END
4     b   140       150
5     b   150       END
6     c   100       END
Or use Series.fillna for older pandas versions, where shift has no fill_value parameter:
df['next_code'] = df.groupby('batch')['Code'].shift(-1).fillna('END')
print(df)
  batch  Code next_code
0     a   100     120.0
1     a   120     130.0
2     a   130     120.0
3     a   120       END
4     b   140     150.0
5     b   150       END
6     c   100       END
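Note that the fillna variant shows values like 120.0 because shift(-1) leaves NaN in the last row of each batch, which forces the integer column to float before 'END' is filled in. If that matters, one possible workaround (a minimal sketch, not the only option) is to cast Code to strings before shifting. It assumes a column of strings is acceptable for next_code, and the sample frame is rebuilt from the question so the snippet runs on its own:

```python
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame({
    "batch": ["a", "a", "a", "a", "b", "b", "c"],
    "Code": [100, 120, 130, 120, 140, 150, 100],
})

# Cast to string before shifting so the gaps left by shift(-1) can be
# filled with 'END' without the remaining values turning into floats.
codes = df["Code"].astype(str)
df["next_code"] = codes.groupby(df["batch"]).shift(-1).fillna("END")

print(df)
```

The Code column itself stays numeric; only next_code holds strings, so it would need converting back before any arithmetic.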