pythonpandas

Filter Pandas dataframe based on combination of two columns


Suppose I have a dataframe as,

   a  b
0  1  2
1  2  3
2  4  2
3  4  3

I want to filter the dataframe such that I get the result as,

   a  b
0  1  2
3  4  3

i.e, I want the combination (1,2) and (4,3) by filtering the two columns together.

If I try this,

df1 = df[df['a'].isin([1,4]) & df['b'].isin([2,3])]

I get the entire dataframe back because combinations of (1,3) and (4,2) also gets included in the above method. But I need only the given combinations. I have a huge list of tuples of two columns based on which I want to filter the dataframe considering the corresponding tuple combination.

Also, I dont want to merge the two columns together as a single string and then filter.


Solution

  • Use -

    df[df[['a', 'b']].apply(tuple, axis=1).isin([(1,2), (4,3)])]
    

    Output

        a   b
    0   1   2
    3   4   3
    

    Explanation

    df[['a', 'b']].apply(tuple, axis=1) gives a series of tuples -

    0    (1, 2)
    1    (2, 3)
    2    (4, 2)
    3    (4, 3)
    

    .isin([(1,2), (4,3)]) searches for the desired tuples and gives a boolean series