pandas

Filtering DataFrame by list of substrings


Building off this answer, is there a way to filter a Pandas dataframe by a list of substrings?

Say I want to find all rows where df['menu_item'] contains 'fresh' or 'spaghetti'.

Without something like this:

df[df['menu_item'].str.contains('fresh') | df['menu_item'].str.contains('spaghetti')]


Solution

  • The str.contains method you're using treats its pattern as a regular expression by default, so you can use the regex alternation operator | as "or":

    df[df['menu_item'].str.contains('fresh|spaghetti')]
    

    Example Input:

              menu_item
    0        fresh fish
    1      fresher fish
    2           lasagna
    3     spaghetti o's
    4  something edible
    

    Example Output:

           menu_item
    0     fresh fish
    1   fresher fish
    3  spaghetti o's
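
Since the question asks about a list of substrings, the pattern can also be built from that list rather than typed by hand. The following is a minimal sketch, not part of the original answer: the names substrings, pattern, and mask are illustrative, re.escape is added on the assumption that the substrings should match literally (so characters such as . or + are not read as regex syntax), and na=False is added on the assumption that missing values should count as non-matches.

```python
import re

import pandas as pd

# Sample data mirroring the example input above.
df = pd.DataFrame({
    "menu_item": [
        "fresh fish",
        "fresher fish",
        "lasagna",
        "spaghetti o's",
        "something edible",
    ]
})

# Hypothetical list of substrings to search for.
substrings = ["fresh", "spaghetti"]

# Escape each substring so it is matched literally, then join with the
# regex alternation operator to get 'fresh|spaghetti'.
pattern = "|".join(re.escape(s) for s in substrings)

# na=False treats missing values as non-matches instead of propagating NaN.
mask = df["menu_item"].str.contains(pattern, na=False)
result = df[mask]

print(result)
```

If the substrings are meant to be regular expressions themselves, the re.escape step would be dropped.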