I have a DataFrame df_things that looks like this, and I want to predict the quality of the classification before training:
A B C CLASS
-----------------------
al1 bal1 cal1 Ship
al1 bal1 cal1 Ship
al1 bal2 cal2 Ship
al2 bal2 cal2 Cow
al3 bal3 cal3 Car
al1 bal2 cal3 Car
al3 bal3 cal3 Car
I want to group the rows by class so that I get an idea of how the features are distributed. I do this with the following (for example, on column "B"):
df_B = df_things.groupby('CLASS').B.value_counts()
which gives me this result:
CLASS  B
-------------
Ship   bal1    2
       bal2    1
Cow    bal2    1
Car    bal2    1
       bal3    2
What I want to do is show only the groups that have more than one value, so that it looks like this:
CLASS  B
-------------
Ship   bal1    2
       bal2    1
Car    bal2    1
       bal3    2
I'm a little bit stuck, so any ideas?
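You can group the value_counts result by its first index level and keep only the classes that have more than one entry. Use transform('size') for this; transform('nunique') would count distinct counts rather than distinct B values, so it would drop a class whose B values all occur equally often.
v = df_things.groupby('CLASS').B.value_counts()
v[v.groupby(level=0).transform('size').gt(1)]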
CLASS  B
Car    bal3    2
       bal2    1
Ship   bal1    2
       bal2    1
Name: B, dtype: int64
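For reference, here is a minimal self-contained sketch of the filtering on the sample data from the question. It assumes pandas is installed; the DataFrame is rebuilt by hand from the table above, and the names n_values and filtered are only illustrative. It counts the entries per class with transform("size"), so a class is kept whenever it has more than one distinct B value.

```python
import pandas as pd

# Sample data rebuilt by hand from the table in the question.
df_things = pd.DataFrame({
    "A": ["al1", "al1", "al1", "al2", "al3", "al1", "al3"],
    "B": ["bal1", "bal1", "bal2", "bal2", "bal3", "bal2", "bal3"],
    "C": ["cal1", "cal1", "cal2", "cal2", "cal3", "cal3", "cal3"],
    "CLASS": ["Ship", "Ship", "Ship", "Cow", "Car", "Car", "Car"],
})

# Count how often each B value occurs within each class.
v = df_things.groupby("CLASS")["B"].value_counts()

# Number of distinct B values per class, broadcast back onto every entry of v
# (each entry of v corresponds to exactly one distinct B value in its class).
n_values = v.groupby(level=0).transform("size")

# Keep only the classes that have more than one distinct B value.
filtered = v[n_values.gt(1)]
print(filtered)
```

On this data, Cow has a single B value and is dropped, while Ship and Car are kept.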