[SOLVED] How to filter and drop rows in a pandas dataframe from selected columns

How to filter and drop rows in a pandas dataframe from selected columns

I've got a pandas dataframe that has 100 columns. First col is a string and the rest are ints. I'd like to drop any row where an int column contains a value that is out of range (less than 0 or greater than 500)

I've seen examples of doing this operation on single columns but not multiple. I've tried the following where I've filled a list with all 99 column names that are ints but the index list returned contains the index of every row so they all get dropped.

drop_list = my_data[(my_data[column_name_list] < 0) | (my_data[column_name_list] > 500)].index
my_data = my_data.drop(drop_list)

Solution

If I understand correctly, you should aggregate with any/all to perform boolean indexing:

out = my_data[~(my_data[column_name_list].lt(0).any(axis=1)
               |my_data[column_name_list].gt(500).any(axis=1)
               )]

Example:

# input
   A  0   1    2    3
0  a -1  10  100   10
1  b  2  20  200   20
2  c  3  30  300  600

# output
   A  0   1    2   3
1  b  2  20  200  20

# intermediates
   A  0   1    2    3  lt(0).any(axis=1)  gt(500).any(axis=1)     OR      ~
0  a -1  10  100   10               True                False   True  False
1  b  2  20  200   20              False                False  False   True
2  c  3  30  300  600              False                 True   True  False

Note that following De Morgan's law this is equivalent to:

out = my_data[(my_data[column_name_list].ge(0).all(axis=1)
             & my_data[column_name_list].le(500).all(axis=1)
              )]

Intermediates:

   A  0   1    2    3  ge(0).all(axis=1)  le(500).all(axis=1)    AND
0  a -1  10  100   10              False                 True  False
1  b  2  20  200   20               True                 True   True
2  c  3  30  300  600               True                False  False