python pandas dataframe pandas-apply

Pandas df.apply function returns None


What I'm trying to do: Pass a column through a regex search in order to return a value that will be added to another column

How: By writing a function with simple if-else clauses:

def category(series):
    pattern = 'microsoft|office|m365|o365'
    if re.search (series,pattern,re.IGNORECASE) != None:
        return 'Microsoft 365'
    else:
        return 'Not Microsoft 365'

df['Category'] = df['name'].apply(category)

Expected Output: A series with values set to Microsoft 365 or Not Microsoft 365

Actual Output: A series with None values

How I've solved it currently:

df.loc[df['name'].str.contains(pattern, case=False), 'Category'] = 'Microsoft 365'

A snippet of the dataset:

name Category
Microsoft None
M365 None

I am trying to understand why the apply function did not work. Any insights will be appreciated. I'm fairly new to Pandas, so I'm not 100% sure what's going wrong.

Thank you!


Solution

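  • The arguments to re.search are swapped in your function: its signature is re.search(pattern, string, flags=0), so the pattern goes first and the string to search goes second. With the order corrected, this should work: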

    import pandas as pd
    import re
    
    df = pd.DataFrame({
        'name': ['Microsoft Exchange Pro', 'Microsoft', 'microsoft', 'office', 'Office', 'M365', 'm365', 'other'], 
        'Category':[None, None, None, None, None, None, None, None]
    })
    
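    def category(series):
        # apply() passes each value of the column in turn, so `series` is a single string here
        pattern = 'microsoft|office|m365|o365'
        # re.search expects the pattern first, then the string to search
        if re.search(pattern, series, re.IGNORECASE) is not None:
            return 'Microsoft 365'
        else:
            return 'Not Microsoft 365'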
    
    df['Category'] = df['name'].apply(category)
    
    print(df)
    

    Result:

                         name           Category
    0  Microsoft Exchange Pro      Microsoft 365
    1               Microsoft      Microsoft 365
    2               microsoft      Microsoft 365
    3                  office      Microsoft 365
    4                  Office      Microsoft 365
    5                    M365      Microsoft 365
    6                    m365      Microsoft 365
    7                   other  Not Microsoft 365
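
As a side note, the str.contains workaround from the question can also produce both labels without apply. The following is only a sketch under some assumptions: it uses NumPy's np.where to choose between the two labels, the variable name is_m365 is made up for illustration, and na=False is added on the assumption that missing names should count as non-matches.

```python
import numpy as np
import pandas as pd

# Same sample names as in the example above.
df = pd.DataFrame({
    'name': ['Microsoft Exchange Pro', 'Microsoft', 'microsoft', 'office',
             'Office', 'M365', 'm365', 'other'],
})

pattern = 'microsoft|office|m365|o365'

# str.contains treats the pattern as a regex by default;
# case=False makes the match case-insensitive, like re.IGNORECASE.
# na=False (an assumption here) treats missing names as non-matches.
is_m365 = df['name'].str.contains(pattern, case=False, na=False)

# np.where picks the first label where the mask is True, otherwise the second.
df['Category'] = np.where(is_m365, 'Microsoft 365', 'Not Microsoft 365')

print(df)
```

This avoids calling a Python function once per row, which may matter on larger frames, but the apply version above is fine for understanding what went wrong.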