How to find churned customers on a monthly basis? Python Pandas

I have a large customer dataset, it has things like Customer ID, Service ID, Product, etc. So the two ways we can measure churn are at a Customer-ID level, if the entire customer leaves and at a Service-ID level where maybe they cancel 2 out of 5 services.

The data looks like this, and as we can see

Alligators stops being a customer at the end of Jan as they dont have any rows in Feb (CustomerChurn)
Aunties stops being a customer at the end of Jan as they dont have any rows in Feb (CustomerChurn)
Bricks continues with Apples and Oranges in Jan and Feb (ServiceContinue)
Bricks continues being a customer but cancels two services at the end of Jan (ServiceChurn)

I am trying to write some code that creates the 'Churn' column.. I have tried

To manually just grab lists of CustomerIDs and ServiceIDs using Set from Oct 2019, and then comparing that to Nov 2019, to find the ones that churned. This is not too slow but doesn't seem very Pythonic.

Thank you!

data = {'CustomerName': ['Alligators','Aunties', 'Bricks', 'Bricks','Bricks', 'Bricks', 'Bricks', 'Bricks', 'Bricks', 'Bricks'], 
        'ServiceID': [1009, 1008, 1001, 1002, 1003, 1004, 1001, 1002, 1001, 1002], 
        'Product': ['Apples', 'Apples', 'Apples', 'Bananas', 'Oranges', 'Watermelon', 'Apples', 'Bananas', 'Apples', 'Bananas'], 
        'Month': ['Jan', 'Jan', 'Jan', 'Jan', 'Jan', 'Jan', 'Feb', 'Feb', 'Mar', 'Mar'], 
        'Year': [2021, 2021, 2021, 2021, 2021, 2021, 2021, 2021, 2021, 2021],
        'Churn': ['CustomerChurn', 'CustomerChurn', 'ServiceContinue', 'ServiceContinue', 'ServiceChurn', 'ServiceChurn','ServiceContinue', 'ServiceContinue', 'NA', 'NA']}
df = pd.DataFrame(data)
df

Solution

I think this gets close to what you want, except for the NA in the last two rows, but if you really need those NA, then you can filter by date and change the values.

Because you are really testing two different groupings, I send the first customername grouping through a function and depending what I see, I send a more refined grouping through a second function. For this data set it seems to work.

I create an actual date column and make sure everything is sorted before grouping. The logic inside the functions is testing the max date of the group to see if it's less than a certain date. Looks like you are testing March as the current month

You should be able to adapt it for your needs

df['testdate'] = df.apply(lambda x: datetime.datetime.strptime('-'.join((x['Month'], str(x['Year']))),'%b-%Y'), axis=1)
df = df.sort_values('testdate')
df1 = df.drop('Churn',axis=1)

def get_customerchurn(x, tdate):
    # print(x)
    # print(tdate)
    if x.testdate.max() < tdate:
        x.loc[:, 'Churn'] = 'CustomerChurn'
        return x
    else:
        x = x.groupby(['CustomerName', 'Product']).apply(lambda x: get_servicechurn(x, datetime.datetime(2021,3,1)))
        return x

def get_servicechurn(x, tdate):
    print(x)
    # print(tdate)
    if x.testdate.max() < tdate:
        x.loc[:, 'Churn'] = 'ServiceChurn'
        return x
    else:
        x.loc[:, 'Churn'] = 'ServiceContinue'
        return x

df2 = df1.groupby(['CustomerName']).apply(lambda x: get_customerchurn(x, datetime.datetime(2021,3,1)))
df2

Output:

  CustomerName  ServiceID     Product Month  Year   testdate            Churn
0   Alligators       1009      Apples   Jan  2021 2021-01-01    CustomerChurn
1      Aunties       1008      Apples   Jan  2021 2021-01-01    CustomerChurn
2       Bricks       1001      Apples   Jan  2021 2021-01-01  ServiceContinue
3       Bricks       1002     Bananas   Jan  2021 2021-01-01  ServiceContinue
4       Bricks       1003     Oranges   Jan  2021 2021-01-01     ServiceChurn
5       Bricks       1004  Watermelon   Jan  2021 2021-01-01     ServiceChurn
6       Bricks       1001      Apples   Feb  2021 2021-02-01  ServiceContinue
7       Bricks       1002     Bananas   Feb  2021 2021-02-01  ServiceContinue
8       Bricks       1001      Apples   Mar  2021 2021-03-01  ServiceContinue
9       Bricks       1002     Bananas   Mar  2021 2021-03-01  ServiceContinue