python pandas dataframe filtering isin

Exclude/Filter values from dataframe with function .isin() in Pandas


I'm working on a Pandas dataframe with transactional data (customer purchases) and want to exclude rows with certain customer numbers contained in a column 'CUSTOMER_ID'.

To achieve this, I created a list with the customer numbers to be excluded: excluded_customers = ['2000', '2100', '3100', '4000', '4100', '4200', '4300', '4400', '4700', '6802']

Then I used the .isin() function to filter my df accordingly and save it in a new df2: df2 = df[(df['CUSTOMER_ID'].isin(excluded_customers) == False)]

Then I want to sort the new df2 by column 'CUSTOMER_ID' in ascending order. However, the excluded customer numbers still appear in the dataframe: df2.sort_values(by=['CUSTOMER_ID'])

I would much appreciate some hints as to why they aren't dropped from the df.

Thank you!


Solution

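  • The column most likely holds integers while the list holds strings, so .isin() finds no matches and nothing is dropped. Convert the column to strings before comparing, and use ~ to invert the mask: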

    df2 = (df[~df['CUSTOMER_ID'].astype(str).isin(excluded_customers)]
              .sort_values(by=['CUSTOMER_ID']))
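
A minimal sketch of why the original filter probably dropped nothing, assuming 'CUSTOMER_ID' is stored as integers (the question does not show the dtype). The sample rows, the 'AMOUNT' column, and the shortened exclusion list are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data; CUSTOMER_ID is assumed to be an integer column.
df = pd.DataFrame({
    "CUSTOMER_ID": [2000, 1500, 4100, 9999],
    "AMOUNT": [10.0, 20.0, 30.0, 40.0],
})
excluded_customers = ["2000", "4100"]  # strings, as in the question

# Integers never compare equal to strings, so the mask is all False
# and the filter keeps every row.
unchanged = df[~df["CUSTOMER_ID"].isin(excluded_customers)]

# Option 1: cast the column to str for the comparison only.
df2 = (df[~df["CUSTOMER_ID"].astype(str).isin(excluded_customers)]
         .sort_values(by=["CUSTOMER_ID"]))

# Option 2: convert the exclusion list to integers instead.
excluded_ints = [int(c) for c in excluded_customers]
df3 = (df[~df["CUSTOMER_ID"].isin(excluded_ints)]
         .sort_values(by=["CUSTOMER_ID"]))
```

Checking df['CUSTOMER_ID'].dtype first tells you which side to convert. Note that .astype(str) inside the mask only affects the comparison, so the column in df2 keeps its original dtype and sorts numerically.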