
Counting frequencies of the corresponding values in pandas [python 3]


I have a dataset with the following values:

var1  var2 
1234  abc
2345  bcs
5678  csd
1234  abc
1234  bcs
5678  csd
1234  bcs
1234  xyz
1234  abc
9101  zzz

For every unique value in column var1, I need to count how often each corresponding value in var2 occurs and show the top 3 counts. The output should look like this, for example:

var1  var2  count
1234  abc   3
1234  bcs   2
1234  xyz   1
5678  csd   2
9101  zzz   1

What's the most efficient way of doing that?


Solution

  • Count the values of var2 within each var1 group with value_counts, then keep only the three most frequent with nlargest(3):

    df.groupby('var1').var2.apply(lambda x: x.value_counts().nlargest(3)) \
        .reset_index(name='count').rename(columns={'level_1': 'var2'})
    
       var1 var2  count
    0  1234  abc      3
    1  1234  bcs      2
    2  1234  xyz      1
    3  2345  bcs      1
    4  5678  csd      2
    5  9101  zzz      1
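
For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and uses a different route from the one above: count each (var1, var2) pair once with size(), sort, and keep the first three rows per group with head(3). This avoids running a Python lambda per group, which may help on larger frames, though that is an assumption I have not benchmarked. The names counts and top3 are just illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "var1": [1234, 2345, 5678, 1234, 1234, 5678, 1234, 1234, 1234, 9101],
    "var2": ["abc", "bcs", "csd", "abc", "bcs", "csd", "bcs", "xyz", "abc", "zzz"],
})

# Count every (var1, var2) pair in a single pass.
counts = df.groupby(["var1", "var2"]).size().reset_index(name="count")

# Sort by var1 ascending and count descending, then keep at most
# the first three rows of each var1 group.
top3 = (
    counts.sort_values(["var1", "count"], ascending=[True, False])
          .groupby("var1")
          .head(3)
          .reset_index(drop=True)
)

print(top3)
```

On the sample data this gives the same six rows as the output above. Ties in count are kept in var2 order here, so if tie handling matters, check it against your own data.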