I have a dataframe with some columns of the same type:
['total_tracks', 't_dur0', 't_dur1', 't_dur2', 't_dance0', 't_dance1', 't_dance2',
't_energy0', 't_energy1', 't_energy2', 't_key0', 't_key1', 't_key2', 't_mode0',
't_mode1', 't_mode2', 't_speech0', 't_speech1', 't_speech2', 't_acous0', 't_acous1',
't_acous2', 't_ins0', 't_ins1', 't_ins2', 't_live0', 't_live1', 't_live2', 't_val0',
't_val1', 't_val2', 't_tempo0', 't_tempo1', 't_tempo2', 't_sig0', 't_sig1', 't_sig2',
'popularity', 'release_year', 'release_month']
And I am trying to combine the columns with the same type like this:
# Takes in a dataframe with three columns and returns a dataframe with one column of their means
def average_column(dataframe):
    dataframe["mean"] = dataframe.mean(axis=1)  # Add column to the dataframe (axis=1 means the mean() is applied row-wise)
    mean_df = dataframe.iloc[:, -1:]  # Isolated column of the mean by selecting all rows (:) for the last column (-1:)
    print("Original: {}\tWith mean:\n{}".format(dataframe, mean_df))
    return mean_df
Inspired by this and this question, I tried to run this code:
t_name_df = df[["t_dur0", "t_dur1", "t_dur2"]]
print(t_name_df.columns.tolist())
average_column(t_name_df)
Which gave me this output:
['t_dur0', 't_dur1', 't_dur2']
Original:
t_dur0 t_dur1 t_dur2 mean
0 2315 2310 2293 2306.000000
1 1558 886 1870 1438.000000
2 803 316 504 541.000000
3 498 815 677 663.333333
4 1508 1677 1386 1523.666667
... ... ... ... ...
[2833 rows x 4 columns]
With mean:
mean
0 2306.000000
1 1438.000000
2 541.000000
3 663.333333
4 1523.666667
... ...
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
To get rid of the warning I tried re-writing it:
t_name_df = df.loc['t_dur0', 't_dur0']
print(t_name_df.column.tolist())
average_column(t_name_df)
Which gave me this error:
KeyError: 't_dur0'
How do I get rid of this warning correctly?
Change your average_column function to this:
def average_column(dataframe):
    # ADD THIS LINE:
    dataframe = dataframe.copy()
    dataframe["mean"] = dataframe.mean(axis=1)  # Add column to the dataframe (axis=1 means the mean() is applied row-wise)
    mean_df = dataframe.iloc[:, -1:]  # Isolated column of the mean by selecting all rows (:) for the last column (-1:)
    print("Original: {}\tWith mean:\n{}".format(dataframe, mean_df))
    return mean_df
The warning appears because t_name_df = df[["t_dur0", "t_dur1", "t_dur2"]] creates a new DataFrame derived from df, and pandas cannot tell whether you meant to modify df through it. Adding a column to t_name_df will not change the original dataframe (df), so pandas warns you about it. By calling .copy(), you explicitly tell pandas that you are working on an independent copy and are fine with df staying unchanged.
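As an alternative, here is a minimal sketch that avoids assigning to the subset at all, so there is nothing for pandas to warn about. It also shows the .loc form you were attempting: .loc expects row labels first and column labels second, which is why df.loc['t_dur0', 't_dur0'] looked for a row named 't_dur0' and raised the KeyError. The small df below is an assumption, built from the first three rows of your printed output.

```python
import pandas as pd

# Stand-in for the real df (assumption): first three rows from the question's output
df = pd.DataFrame({
    "t_dur0": [2315, 1558, 803],
    "t_dur1": [2310, 886, 316],
    "t_dur2": [2293, 1870, 504],
})

def average_column(dataframe):
    # Compute the row-wise mean and wrap it in a one-column DataFrame.
    # Nothing is assigned to the input, so no SettingWithCopyWarning is triggered.
    return dataframe.mean(axis=1).to_frame(name="mean")

# .loc takes rows first, then columns: ":" selects every row
t_name_df = df.loc[:, ["t_dur0", "t_dur1", "t_dur2"]]
mean_df = average_column(t_name_df)
print(mean_df)
```

Because the function only reads from its argument, it works the same whether you pass df[[...]] or df.loc[:, [...]], and df itself is left untouched.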