python python-3.x numpy dataframe error-correction

Calculate correction factor in Python


I have a dataframe with three columns:

true false category 
36   25    3
40   25    3
46   23    3
40   22    5
42   20    4
56   39    3
50   40    3
44   27    4
51   39    5
54   31    5
50   38    4

I am trying to calculate a correction factor for each category to correct the "false" values. For example, for category 5:

correction1 = 40/22 = 1.82
correction2 = 51/39 = 1.31
correction3 = 54/31 = 1.74

The arithmetic mean of these correction factors is 1.62. The result should be one averaged correction factor per category.

Question: Is there a built-in function in Python/NumPy to calculate this?


Solution

  • You can do this as follows:

    (df['true'].div(df['false'])).groupby(df['category']).mean()
    

    This computes the row-wise ratio true/false, groups the ratios by category, and takes the mean of each group.

    To build the test data:

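    from io import StringIO

    import pandas as pd

    # Sample data from the question, whitespace-separated
    infile = StringIO(
    """true false category
    36   25    3
    40   25    3
    46   23    3
    40   22    5
    42   20    4
    56   39    3
    50   40    3
    44   27    4
    51   39    5
    54   31    5
    50   38    4""")
    # Raw string so that \s is passed to the regex unchanged
    df = pd.read_csv(infile, sep=r'\s+', dtype='int16')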
    

    The result is:

    category
    3    1.545179
    4    1.681806
    5    1.622603
    dtype: float64
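
For readers who prefer to see the intermediate steps, the one-liner above can be sketched as follows. The last step, which multiplies each "false" value by its category's factor, is only an assumption about how the factors would be used; the names `ratio`, `factors` and `corrected` are illustrative.

```python
import pandas as pd

# Same data as in the question
df = pd.DataFrame({
    "true":     [36, 40, 46, 40, 42, 56, 50, 44, 51, 54, 50],
    "false":    [25, 25, 23, 22, 20, 39, 40, 27, 39, 31, 38],
    "category": [3, 3, 3, 5, 4, 3, 3, 4, 5, 5, 4],
})

# Step 1: one correction ratio per row
ratio = df["true"] / df["false"]

# Step 2: average the ratios within each category
factors = ratio.groupby(df["category"]).mean()

# Assumed usage (not part of the original answer):
# look up each row's category factor and scale its "false" value
corrected = df["false"] * df["category"].map(factors)
```

Here `factors` is a Series indexed by category, so `factors.loc[5]` gives the averaged factor for category 5.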