python pandas switch-statement nan missing-data

Fill NaN values if another column's value is not in a list


I have the following dataframe:

import pandas as pd
import numpy as np
df = pd.DataFrame({
    'Name': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B'],
    'Include': [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan],
    'Category': ['Cat', 'Dog', 'Car', 'Dog', 'Bike', 'Dog', 'Cat', 'Bike'],
})

df

I am trying to fill the Include column with the string yes wherever the Category value is not in the following list:

exluded = ['Car','Bike']

My expected output has yes in Include for the Cat and Dog rows, while the Car and Bike rows stay NaN.

Any ideas on how to achieve this?


Solution

  • Use loc and a boolean mask:

    df.loc[~df['Category'].isin(exluded), 'Include'] = 'yes'
    print(df)
    
    # Output
      Name Include Category
    0    A     yes      Cat
    1    B     yes      Dog
    2    A     NaN      Car
    3    B     yes      Dog
    4    A     NaN     Bike
    5    B     yes      Dog
    6    A     yes      Cat
    7    B     NaN     Bike
    

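    Alternative with np.where. Pass None rather than np.nan for the excluded rows: when np.nan is combined with a string, NumPy returns a string array, so those rows would hold the text 'nan' instead of a missing value.

    df['Include'] = np.where(df['Category'].isin(exluded), None, 'yes')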
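To check either approach, it can help to build the mask in steps and count the missing values afterwards. The following is only a sketch on the question's data. The names is_excluded, with_nan and with_none are made up for illustration, and the astype(object) line is an added precaution that is not part of the answer above: recent pandas versions may warn when a string is written into a float column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B'],
    'Include': [np.nan] * 8,
    'Category': ['Cat', 'Dog', 'Car', 'Dog', 'Bike', 'Dog', 'Cat', 'Bike'],
})
exluded = ['Car', 'Bike']  # spelling kept from the question

# Step 1: True where Category is one of the excluded values
is_excluded = df['Category'].isin(exluded)

# Step 2 (added precaution): make the all-NaN float column object-typed,
# so that storing a string in it does not change the column's dtype
df['Include'] = df['Include'].astype(object)

# Step 3: write 'yes' only into the rows that are NOT excluded
df.loc[~is_excluded, 'Include'] = 'yes'

# np.where comparison: np.nan next to a string vs. None next to a string
with_nan = pd.Series(np.where(is_excluded, np.nan, 'yes'), index=df.index)
with_none = pd.Series(np.where(is_excluded, None, 'yes'), index=df.index)

# Count the rows that pandas treats as missing in each result
print('loc:           ', df['Include'].isna().sum())
print('np.where + nan:', with_nan.isna().sum())
print('np.where + None:', with_none.isna().sum())
```

On this data the loc result and the None variant should each report three missing rows (the Car and Bike rows), while the np.nan variant should report none, because its excluded rows contain the string 'nan'.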