{'Country': 'USA', 'Age': '52', 'Sal': '12345', 'OnWork': 'No'}
{'Country': 'UK', 'Age': '23', 'Sal': '1142', 'OnWork': 'Yes'}
{'Country': 'MAL', 'Age': '25', 'Sal': '4456', 'OnWork': 'No'}
{'Country': 'MAL', 'Age': '25', 'Sal': '4456', 'OnWork': 'No'}
{'Country': 'MAL', 'Age': '?', 'Sal': '2345', 'OnWork': 'Yes'}
{'Country': 'MAL', 'Age': '25', 'Sal': '3342', 'OnWork': 'Yes'}
{'Country': 'MAL', 'Age': '25', 'Sal': '3452', 'OnWork': 'No'}
{'Country': 'MAL', 'Age': '?', 'Sal': '3562', 'OnWork': 'No'}
Here I have to replace the missing mean value bases on "OnWork" value. Group Yes and its mean value go to Row5 Age. Group NO and its value should go to the Last row.
df = pd.read_csv("Mycal.csv", na_values = missing_values, nrows=50)
df["F8"].fillna(df['F8'].mean(), inplace=True)
df[df["Class"]=="Yes"]["F8"].mean()
I am expecting the Yes values should group and fill Missing value the Mean to fill same for NO. Kindly help me with this
df['Age'] = df['Age'].mask(df['Age'].eq('?'), np.nan).astype(float)
df['Age'] = (df['Age'].fillna(df.groupby('OnWork')['Age'].transform(np.nanmean))
.astype(int))
print(df)
Country Age Sal OnWork
0 USA 52 12345 No
1 UK 23 1142 Yes
2 MAL 25 4456 No
3 MAL 25 4456 No
4 MAL 24 2345 Yes
5 MAL 25 3342 Yes
6 MAL 25 3452 No
7 MAL 31 3562 No
If you want to replace multiple column values at once use:
df = df.fillna(df.groupby('OnWork').transform('mean'))