There is a DataFrame with option tickers at the zero level, prices (open, close, high, and low) at the first level, and option types at the second level (header structure: 'Ticker', 'open/close/high/low', 'Type'). The task: for each ticker, calculate the average price, then add columns with the average prices (header structure: 'Ticker', 'Average', 'Type') to the original DataFrame. My script executes almost correctly, but it corrupts the column names of the original DataFrame.
Code:
import pandas as pd
import numpy as np
df_f_avg = pd.DataFrame({
('a1c9', 'open', 'call'): [3, 0.50, 3.0],
('a1c9', 'close', 'call'): [4, 0.35, 3.5],
('a1c9', 'high', 'call'): [1, 0.20, 1.5],
('a1c9', 'low', 'call'): [2, 0.09, 2.5],
('a1p9', 'open', 'put'): [5, 0.05, 4.0],
('a1p9', 'close', 'put'): [6, 0.10, 5.5],
('a1p9', 'high', 'put'): [7, 0.20, 6.0],
('a1p9', 'low', 'put'): [8, 0.31, 7.0]})
df_f_avg.columns.names = ['Ticker', 'Price', 'Type']
df_f_avg
columns_name = df_f_avg.loc[:, pd.IndexSlice[:, 'open']].columns
avg_group_ticker = pd.DataFrame(data=df_f_avg.loc[:, (slice(None), ['open', 'close', 'high', 'low'], slice(None))].T.groupby('Ticker').mean().T)
avg_group_ticker.columns = columns_name
avg_group_ticker.columns._levels[1].values[:] = 'avg'
df_f_avg = pd.concat([df_f_avg, avg_group_ticker], axis=1)
df_f_avg
It turns out like this: …
Ticker a1c9 a1p9 a1c9 a1p9
Price avg avg avg avg
Type call call call call put put put put call put
0 3.0 4.00 1.0 2.00 5.00 6.0 7.0 8.00 2.500 6.500
1 0.5 0.35 0.2 0.09 0.05 0.1 0.2 0.31 0.285 0.165
2 3.0 3.50 1.5 2.50 4.00 5.5 6.0 7.00 2.625 5.625
But it should look like this: …
Ticker a1c9 a1p9 a1c9 a1p9
Price open close high low open close high low avg avg
Type call call call call put put put put call put
0 3.0 4.00 1.0 2.00 5.00 6.0 7.0 8.00 2.500 6.500
1 0.5 0.35 0.2 0.09 0.05 0.1 0.2 0.31 0.285 0.165
2 3.0 3.50 1.5 2.50 4.00 5.5 6.0 7.00 2.625 5.625
Questions:
The issue is that you modify the original index since columns_name = df_f_avg.loc[:, pd.IndexSlice[:, 'open']].columns
is making a reference to it.
You must make a deep copy:
columns_name = df_f_avg.loc[:, pd.IndexSlice[:, 'open']].columns.copy(deep=True)
You could also rename
Price and groupby
all levels:
out = df_f_avg.join(df_f_avg.rename(columns=lambda x: 'avg', level='Price')
.T.groupby(['Ticker', 'Price', 'Type']).mean().T
)
Output:
Ticker a1c9 a1p9 a1c9 a1p9
Price open close high low open close high low avg avg
Type call call call call put put put put call put
0 3.0 4.00 1.0 2.00 5.00 6.0 7.0 8.00 2.500 6.500
1 0.5 0.35 0.2 0.09 0.05 0.1 0.2 0.31 0.285 0.165
2 3.0 3.50 1.5 2.50 4.00 5.5 6.0 7.00 2.625 5.625