
How to replace nested for loops with apply on a DataFrame in Python


I have the DataFrame and list below.

import pandas as pd

my_df = pd.DataFrame({'fruits': ['apple', 'banana', 'cherry', 'durian'], 
                      'check': [False, False, False, False]})
my_list = ['pp', 'ana', 'ra', 'cj', 'up', 'down', 'pri']

>>> my_df
   fruits  check
0   apple  False
1  banana  False
2  cherry  False
3  durian  False

I can get the result I want with nested for loops.

for fruit in my_df['fruits']:
    for v in my_list:
        if v in fruit:
            my_df.loc[my_df['fruits']==fruit, 'check'] = True

>>> my_df
   fruits  check
0   apple   True
1  banana   True
2  cherry  False
3  durian  False

I tried the following.

my_df['fruits'].apply(lambda x: True for i in my_list if i in x)

But it raised TypeError: 'generator' object is not callable

I want to remove the nested for loops and replace them with apply(). How can I do this?


Solution

  • This line is the problem:

    my_df['fruits'].apply(lambda x: True for i in my_list if i in x)
    

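    Python does not parse this as a lambda that returns a generator. The lambda body ends at True, and the rest (for i in my_list if i in x) turns the whole argument into a generator expression that yields lambdas. So apply() receives a generator object where it expects a function, and fails as soon as it tries to call it. apply() needs a callable that returns one value per element (here a boolean), and any() is what reduces the per-substring checks to a single True/False.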
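To see why the call fails, it can help to inspect what Python builds from that argument. The sketch below assigns the same expression to a name (arg is a hypothetical name used only for illustration) and checks its type. It assumes nothing beyond the standard library.

```python
import types

my_list = ['pp', 'ana', 'ra', 'cj', 'up', 'down', 'pri']

# The same expression that was passed to apply(), wrapped in parentheses so
# it can be assigned. `x` is never looked up here because the generator is
# lazy and is never iterated.
arg = (lambda x: True for i in my_list if i in x)

print(isinstance(arg, types.GeneratorType))  # True
print(callable(arg))                         # False

# Calling it the way apply() would reproduces the message from the question.
try:
    arg('apple')
except TypeError as exc:
    print(exc)  # 'generator' object is not callable
```

The object handed to apply() is a generator, not a function, which is why the error appears as soon as pandas tries to call it.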

    The corrected line is:

    my_df['check'] = my_df['fruits'].apply(lambda x: any(v in x for v in my_list))
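As a self-contained check, the corrected line can be run against the data from the question. The second half of the sketch is an alternative that is not part of the original answer: it swaps apply() for a single regex passed to Series.str.contains, with re.escape used so the substrings are matched literally. The names pattern and vectorized are hypothetical.

```python
import re

import pandas as pd

my_df = pd.DataFrame({'fruits': ['apple', 'banana', 'cherry', 'durian'],
                      'check': [False, False, False, False]})
my_list = ['pp', 'ana', 'ra', 'cj', 'up', 'down', 'pri']

# apply() calls the lambda once per fruit; any() returns True as soon as
# one substring from my_list is found in that fruit.
my_df['check'] = my_df['fruits'].apply(lambda x: any(v in x for v in my_list))
print(my_df['check'].tolist())  # [True, True, False, False]

# Alternative (an assumption, not from the answer): build one regex that
# matches any of the substrings and test the whole column at once.
pattern = '|'.join(re.escape(v) for v in my_list)
vectorized = my_df['fruits'].str.contains(pattern)
print(vectorized.tolist())  # [True, True, False, False]
```

Both approaches give the same result as the nested loops. The apply() version stays closest to the original code, while the str.contains version avoids a Python-level call per row.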
    


    Sources:

    https://docs.python.org/3/tutorial/controlflow.html#lambda-expressions

    https://docs.python.org/3/tutorial/classes.html#generator-expressions

    https://docs.python.org/3/library/functions.html#any