I have categorical data (A
, B
, etc.) in which multiple matches can exist within the same field such as A,B
. I would like to break my data into additional rows just for the purpose of counting the number of instances of each value.
df = pd.DataFrame({"Values" : ["A", "B", "C", "A,B"]})
df
Values
0 A
1 B
2 C
3 A,B
Currently:
df["Values"].value_counts()
B 1
A,B 1
A 1
C 1
Name: Values, dtype: int64
My ideal function would work something like this:
df["Values"].value_counts(split = ",")
A 2
B 2
C 1
Name: Values, dtype: int64
Use Series.str.split
and then explode()
print( df['Values'].str.split(',').explode().value_counts() )
Prints:
A 2
B 2
C 1
Name: Values, dtype: int64
EDIT:
df = pd.DataFrame({"Values" : ["A", "B", "C", "A,B"]})
print( df['Values'].str.split(',').explode().value_counts() )