I have a multiindex
dataframe
, which comes from groupby
.
Here is a demo:
In [54]: df = pd.DataFrame({'color': ['blue', 'grey', 'blue', 'grey', 'black'], 'name': ['pen', 'pen', 'pencil', 'pencil', 'box'],'price':[2.5, 2.3, 1.5, 1.3, 5.2],'bprice':[2.2, 2, 1.3, 1.2, 5.0]})
In [55]: df
Out[55]:
color name price bprice
0 blue pen 2.5 2.2
1 grey pen 2.3 2.0
2 blue pencil 1.5 1.3
3 grey pencil 1.3 1.2
4 black box 5.2 5.0
In [56]: a = df.groupby(['color', 'name'])[['price', 'bprice']].sum()
In [57]: a
Out[57]:
price bprice
color name
black box 5.2 5.0
blue pen 2.5 2.2
pencil 1.5 1.3
grey pen 2.3 2.0
pencil 1.3 1.2
I want to add a row in every color index, the ideal output is:
price bprice
color name
black * 5.2 5.0
box 5.2 5.0
blue * 4.0 3.5
pen 2.5 2.2
pencil 1.5 1.3
grey * 3.6 3.2
pen 2.3 2.0
pencil 1.3 1.2
There are two requirements:
*
row should be in the first row of each group*
row, the other row should be sorted by price
I tried a lot of methods, but not find a elegant method. Insert a row into a multiindex
dataframe
with specified position seems be hard.
could you help on this?
Compute a groupby.sum
on a
, then append a level with *
and concat
, finally sort_index
based on color
:
# compute the sum per color/name
# sort by descending price
a = (df.groupby(['color', 'name'])[['price', 'bprice']].sum()
.sort_values(by='price', ascending=False)
)
# compute the sum per color
# concatenate, sort_index in a stable way
out = (pd.concat([a.groupby('color').sum()
.assign(name='*')
.set_index('name', append=True),
a])
.sort_index(level='color', kind='stable',
sort_remaining=False)
)
Output:
price bprice
color name
black * 5.2 5.0
box 5.2 5.0
blue * 4.0 3.5
pen 2.5 2.2
pencil 1.5 1.3
grey * 3.6 3.2
pen 2.3 2.0
pencil 1.3 1.2