[SOLVED] How do I reshape this DataFrame in Python?

I have a DataFrame df_sale that I want to reshape so that there is one row per b_no, with the a_id values spread across columns and a new column total holding the sum of the price column for that b_no.

Here is the df_sale dataframe:

b_no  a_id  price  c_id
120   24     50     2
120   56     100    2
120   90     25     2
120   45     20     2
231   89     55     3
231   45     20     3
231   10     250    3

Expected output after reshaping:

b_no  a_id_1  a_id_2  a_id_3  a_id_4  total  c_id
120   24      56      90      45      195    2
231   89      45      10      0       325    3

What I have tried so far is calling sum() on df_sale['price'] separately for b_no 120 and 231. I do not understand how to reshape the data, add the new column headers, and get the total without being computationally inefficient. Thanks.

Solution

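This might not be the cleanest method (at all), but it gets the outcome you want, except that the a_id columns are numbered from 0 and come out as floats because the missing slot is NaN before it is filled: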

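import pandas as pd

reshaped_df = (df_sale.groupby('b_no')[['price', 'c_id']]
               .first()                  # one row per b_no, keeps c_id
               .join(df_sale.groupby('b_no')['a_id']
                     .apply(list)        # collect the a_id values of each b_no
                     .apply(pd.Series)   # spread each list across columns 0, 1, 2, ...
                     .add_prefix('a_id_'))
               .drop(columns='price')    # keyword form; newer pandas rejects drop('price', 1)
               .join(df_sale.groupby('b_no')['price'].sum().to_frame('total'))
               .fillna(0))               # shorter groups get 0 in the unused a_id columns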


>>> reshaped_df
      c_id  a_id_0  a_id_1  a_id_2  a_id_3  total
b_no                                             
120      2    24.0    56.0    90.0    45.0    195
231      3    89.0    45.0    10.0     0.0    325
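
If you want the exact layout from the question (a_id_1 to a_id_4, integer values, c_id last), one alternative is to number the rows inside each b_no with cumcount() and pivot on that number. This is only a sketch, not the method used above: the names position, wide, summary and result are made up for illustration, and it assumes c_id is the same for every row of a given b_no, as it is in the sample data.

```python
import pandas as pd

# Sample data copied from the question.
df_sale = pd.DataFrame({
    "b_no":  [120, 120, 120, 120, 231, 231, 231],
    "a_id":  [24, 56, 90, 45, 89, 45, 10],
    "price": [50, 100, 25, 20, 55, 20, 250],
    "c_id":  [2, 2, 2, 2, 3, 3, 3],
})

# Number the rows within each b_no group: 1, 2, 3, ...
position = df_sale.groupby("b_no").cumcount() + 1

# One column per position; groups with fewer rows get 0 in the empty slots.
wide = (df_sale.assign(position=position)
               .pivot(index="b_no", columns="position", values="a_id")
               .fillna(0)
               .astype(int)
               .add_prefix("a_id_")
               .rename_axis(None, axis=1))   # drop the leftover "position" label

# Total price per b_no, plus c_id (assumed constant within a group).
summary = df_sale.groupby("b_no").agg(total=("price", "sum"),
                                      c_id=("c_id", "first"))

result = wide.join(summary).reset_index()
print(result)
```

Everything here is a single groupby or pivot over the whole frame, so there is no need to sum each b_no separately.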