python-3.xpandaschurn

Calculate monthly churn rate in pandas


This is an example of the data I have in my dataframe:

test = pd.DataFrame({
    'month': [1,2,3,4,5,6,7,8,9],
    'new': [23,45,67,89,12,34,56,90,12],
    'drop': [2,4,7,9,1,4,6,9,1],
})

month   new drop
0   1   23  2
1   2   45  4
2   3   67  7
3   4   89  9
4   5   12  1
5   6   34  4
6   7   56  6
7   8   90  9
8   9   12  1

I need to calculate the monthly churn rate. I need to sum 2 rows in the new column and then divide the value in drop by this sum (in %).

    month 1: 2*100/23
    month 2: 4*100/(23+45-2)
    month 3: 7*100/(23+45+67-2-4)

    etc.

Could anyone, please, suggest an elegant way of doing this?


Solution

  • You need:

    test['drop'].mul(100).div((test['new'].cumsum() - test['drop'].cumsum().shift()).fillna(test['new']))
    

    Output:

    0    8.695652
    1    6.060606
    2    5.426357
    3    4.265403
    4    0.467290
    5    1.619433
    6    2.006689
    7    2.349869
    8    0.259067
    dtype: float64
    

    Explanation:

    (test['new'].cumsum() - test['drop'].cumsum().shift()).fillna(test['new'])
    

    Provides the cumsum of new with subtraction with previous drop cumsum.

    Output (comments added for explanation):

    0     23.0 # 23
    1     66.0 # 23+45-2
    2    129.0 # 23+45+67-2-4
    3    211.0
    4    214.0
    5    247.0
    6    299.0
    7    383.0
    8    386.0