Here is my dataframe:
import pandas as pd

my_df = pd.DataFrame({'col_1': ['A', 'A', 'B', 'B', 'C', 'C'],
                      'col_2': [1, 2, 1, 2, 1, 2]})
I would like to group by col_1 and filter out anything strictly greater than one using col_2. The final result should look like:
final_df = pd.DataFrame({'col_1': ['A', 'B', 'C'],
                         'col_2': [1, 1, 1]})
Here is what I tried:
df_ts = my_df.groupby('col_1').filter(lambda x: (x['col_2'] <= 1).any())
It returns the same dataframe.
I also tried:
df_ts = my_df.groupby('col_1').filter(lambda x: x['col_2'] <= 1)
It raises an error.
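groupby.filter keeps or drops a full group based on its members. In your first attempt every group contains at least one value <= 1, so every group is kept and you get the same dataframe back. In your second attempt the lambda returns a Series instead of a single boolean per group, which filter does not accept.
What you want is simply to filter rows. You do not need groupby: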
out = my_df[my_df['col_2'].le(1)]
Output:
  col_1  col_2
0     A      1
2     B      1
4     C      1
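As a minimal sketch of the difference, the snippet below runs both approaches on the dataframe from the question. The names kept_groups and g are only illustrative, and the reset_index step is an assumption that you want the 0, 1, 2 index shown in final_df rather than the original row labels.

```python
import pandas as pd

my_df = pd.DataFrame({'col_1': ['A', 'A', 'B', 'B', 'C', 'C'],
                      'col_2': [1, 2, 1, 2, 1, 2]})

# groupby.filter expects one boolean per group and keeps or drops the whole group.
# Every group here contains a value <= 1, so all groups (and all rows) are kept.
kept_groups = my_df.groupby('col_1').filter(lambda g: (g['col_2'] <= 1).any())
print(kept_groups.equals(my_df))  # True

# A boolean mask is evaluated row by row, so only rows with col_2 <= 1 survive.
out = my_df[my_df['col_2'].le(1)]

# Optional: drop the original row labels (0, 2, 4) to get a fresh 0..n-1 index.
out = out.reset_index(drop=True)

final_df = pd.DataFrame({'col_1': ['A', 'B', 'C'],
                         'col_2': [1, 1, 1]})
print(out.equals(final_df))  # True
```

If you do not care about the index, you can skip reset_index and keep the original labels, as in the output above.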