I have a dataframe with three columns:
true false category
36 25 3
40 25 3
46 23 3
40 22 5
42 20 4
56 39 3
50 40 3
44 27 4
51 39 5
54 31 5
50 38 4
I am trying to calculate a correction factor for each category to correct the "false" values. For example, for category 5:
correction1 = 40/22 = 1.82
correction2 = 51/39 = 1.31
correction3 = 54/31 = 1.74
The arithmetic mean of these correction factors is then 1.62. So the result should be an averaged correction factor for each category.
Question: Is there a built-in function in Python/NumPy to calculate this?
You can do this as follows:
(df['true'].div(df['false'])).groupby(df['category']).mean()
This builds the row-wise ratios, groups them by category, and finally calculates the mean of each group.
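The one-liner above can also be read as three separate steps. The following is only a sketch of that breakdown: the intermediate names `ratios`, `grouped`, and `factors` are illustrative and not part of the original answer, and the dataframe is built inline from the values in the question.

```python
import pandas as pd

# Data from the question, built inline so the snippet runs on its own.
df = pd.DataFrame({
    "true":     [36, 40, 46, 40, 42, 56, 50, 44, 51, 54, 50],
    "false":    [25, 25, 23, 22, 20, 39, 40, 27, 39, 31, 38],
    "category": [3, 3, 3, 5, 4, 3, 3, 4, 5, 5, 4],
})

# Step 1: one correction factor per row (true / false).
ratios = df["true"].div(df["false"])

# Step 2: group the row-wise factors by the category column.
grouped = ratios.groupby(df["category"])

# Step 3: arithmetic mean of the factors within each category.
factors = grouped.mean()

print(factors.round(2))
```

Here `factors.loc[5]` should match the hand-computed value of about 1.62 for category 5.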
To get the test data:
from io import StringIO
import pandas as pd
infile = StringIO(
"""true false category
36 25 3
40 25 3
46 23 3
40 22 5
42 20 4
56 39 3
50 40 3
44 27 4
51 39 5
54 31 5
50 38 4""")
df = pd.read_csv(infile, sep=r'\s+', dtype='int16')
The result is:
category
3 1.545179
4 1.681806
5 1.622603
dtype: float64
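Since the question also mentions NumPy: as far as I know there is no single NumPy function for a grouped mean, but one possible sketch combines `np.unique` with `np.bincount`. This is an alternative to the pandas approach, not part of it, and the helper name `mean_ratio_by_category` is made up for illustration.

```python
import numpy as np

def mean_ratio_by_category(true, false, category):
    """Return (categories, mean of true/false per category)."""
    ratios = np.asarray(true, dtype=float) / np.asarray(false, dtype=float)
    # inverse maps every row to the index of its category in `cats`.
    cats, inverse = np.unique(np.asarray(category), return_inverse=True)
    # Sum of ratios per category divided by the number of rows per category.
    sums = np.bincount(inverse, weights=ratios)
    counts = np.bincount(inverse)
    return cats, sums / counts

true = [36, 40, 46, 40, 42, 56, 50, 44, 51, 54, 50]
false = [25, 25, 23, 22, 20, 39, 40, 27, 39, 31, 38]
category = [3, 3, 3, 5, 4, 3, 3, 4, 5, 5, 4]

cats, means = mean_ratio_by_category(true, false, category)
for c, m in zip(cats, means):
    print(c, round(float(m), 6))
```

This should give the same per-category values as the pandas result above. The pandas version is usually easier to read, so the NumPy variant mainly makes sense when pandas is not available.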