pythonpython-3.xpandas

How do I split data out from one column of a pandas dataframe into multiple columns of a new dataframe


I would like to split data from this pandas dataframe (let's call it df1):

YEAR   CODE   DIFF
2013   XXXX   5.50
2013   YYYY   8.50
2013   ZZZZ   6.50
2014   XXXX   4.50
2014   YYYY   2.50
2014   ZZZZ   3.50

Such that I create a new dataframe (let's call it df2) that looks like this:

YEAR   XXXX_DIFF   ZZZZ_DIFF
2013   5.50        6.50
2014   4.50        3.50

I guess I'm grouping by year and splitting the single column results found in DIFF into specific CODE matches. I have tried something like this:

df2 = df1[['YEAR','CODE','DIFF']].query('CODE == "XXXX"')

And I know I can rename columns and drop redundant ones but I wasn't sure how to get the ZZZZ DIFF values across to df2?


Solution

  • Using pivot + filter + add_suffix:

    out = (df.pivot(*df).filter(['XXXX','ZZZZ']).add_suffix('_DIFF')
                       .reset_index().rename_axis(None,axis=1))
    

    print(out)
    
       YEAR  XXXX_DIFF  ZZZZ_DIFF
    0  2013        5.5        6.5
    1  2014        4.5        3.5