python pandas stata reshape

Pandas long to wide reshape, by two variables


I have data in long format and am trying to reshape to wide, but there doesn't seem to be a straightforward way to do this using melt/stack/unstack:

Salesman  Height   product      price
  Knut      6        bat          5
  Knut      6        ball         1
  Knut      6        wand         3
  Steve     5        pen          2

Becomes:

Salesman  Height  product_1  price_1  product_2  price_2  product_3  price_3
  Knut      6       bat        5        ball       1        wand       3
  Steve     5       pen        2        NA         NA       NA         NA

I think Stata can do something like this with the reshape command.


Solution

  • Here's another, more fleshed-out solution, taken from Chris Albon's site. Note that it pivots a single value column (score), not two.

    Create a "long" DataFrame

    import pandas as pd

    raw_data = {
        'patient': [1, 1, 1, 2, 2],
        'obs': [1, 2, 3, 1, 2],
        'treatment': [0, 1, 0, 1, 0],
        'score': [6252, 24243, 2345, 2342, 23525]}
    
    df = pd.DataFrame(raw_data, columns=['patient', 'obs', 'treatment', 'score'])
    
       patient  obs  treatment  score
    0        1    1          0   6252
    1        1    2          1  24243
    2        1    3          0   2345
    3        2    1          1   2342
    4        2    2          0  23525
    

    Make the data "wide"

    df.pivot(index='patient', columns='obs', values='score')
    
    obs           1        2       3
    patient
    1        6252.0  24243.0  2345.0
    2        2342.0  23525.0     NaN
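
The pivot above handles one value column and relies on an existing `obs` counter. For the data in the question, which has two value columns (product and price) and no counter, one possible approach is sketched below. It is not from the answer above: the helper column name `idx` and the `product_1`, `price_1`, ... naming scheme are assumptions chosen to match the desired output, and the within-group numbering simply follows row order.

```python
import pandas as pd

# The long-format data from the question
df = pd.DataFrame({
    "Salesman": ["Knut", "Knut", "Knut", "Steve"],
    "Height": [6, 6, 6, 5],
    "product": ["bat", "ball", "wand", "pen"],
    "price": [5, 1, 3, 2],
})

# Number the rows within each salesman (1, 2, 3, ...); "idx" is a
# hypothetical helper column playing the role of "obs" above
df["idx"] = df.groupby("Salesman").cumcount() + 1

# Move idx into the columns; the result has MultiIndex columns
# of the form (variable, idx)
wide = df.set_index(["Salesman", "Height", "idx"])[["product", "price"]].unstack("idx")

# Interleave the columns as product_1, price_1, product_2, price_2, ...
n = int(df["idx"].max())
ordered = [(var, i) for i in range(1, n + 1) for var in ("product", "price")]
wide = wide[ordered]

# Flatten the two column levels into single names and restore the keys
wide.columns = [f"{var}_{i}" for var, i in wide.columns]
wide = wide.reset_index()

print(wide)
```

Salesmen with fewer products than the largest group get NaN in the trailing columns, which corresponds to the NA cells in the desired output.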