Tags: pandas, dataframe, data-science, data-munging

Pandas: add a column holding the next row's value from another column (per group)


I have the dataframe:

df = batch Code
      a     100
      a     120
      a     130
      a     120 
      b     140
      b     150
      c     100

I want to add a column 'next_code' that holds the value of the column 'Code' from the next row within the same batch. The last row of each batch has no next row and should get 'END'. The expected output is:

df = batch Code next_code
      a     100    120
      a     120    130
      a     130    120
      a     120    END
      b     140    150
      b     150    END
      c     100    END

What is the best way to do it?


Solution

  • Use DataFrameGroupBy.shift with the fill_value parameter:

    df['next_code'] = df.groupby('batch')['Code'].shift(-1, fill_value='END')
    print(df)
      batch  Code next_code
    0     a   100       120
    1     a   120       130
    2     a   130       120
    3     a   120       END
    4     b   140       150
    5     b   150       END
    6     c   100       END
    

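    Or chain Series.fillna, for older pandas versions where shift does not accept fill_value. The shifted values are upcast to float before filling, hence the 120.0 in the output: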

    df['next_code'] = df.groupby('batch')['Code'].shift(-1).fillna('END')
    print(df)
      batch  Code next_code
    0     a   100     120.0
    1     a   120     130.0
    2     a   130     120.0
    3     a   120       END
    4     b   140     150.0
    5     b   150       END
    6     c   100       END
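
If the 120.0 formatting of the fillna variant is unwanted, one possible workaround is to convert the shifted values back to integers while filling the gaps. The following is a minimal sketch, not part of the original answer: it rebuilds the frame from the question and uses Series.map with a small lambda, so it should not depend on fill_value support. The intermediate name `shifted` is only illustrative.

```python
import pandas as pd

# Rebuild the frame from the question.
df = pd.DataFrame({
    "batch": ["a", "a", "a", "a", "b", "b", "c"],
    "Code": [100, 120, 130, 120, 140, 150, 100],
})

# Shift Code up by one row within each batch. The last row of every
# batch has no successor, so it becomes NaN and the column turns float64.
shifted = df.groupby("batch")["Code"].shift(-1)

# Replace NaN with the 'END' marker and turn the remaining floats back
# into plain ints. The result is an object column mixing ints and strings.
df["next_code"] = shifted.map(lambda v: "END" if pd.isna(v) else int(v))

print(df)
```

Because the column mixes integers and the string 'END', it ends up with object dtype in every variant. If the column is needed for arithmetic later, it may be simpler to keep the NaN values from shift(-1) and skip the marker.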