python pandas nan

Count NaNs when using value_counts() on a dataframe


I want to count how often each combination of values occurs across two columns of a DataFrame:

No  Name
1   A
1   A
5   T
9   V
NaN M
5   T
1   A

I expected df[["No", "Name"]].value_counts() to give

No  Name Count
1   A    3
5   T    2
9   V    1
NaN M    1

But I am missing the row containing NaN.

Is there a way to include NaNs in value_counts()?


Solution

  • You can use groupby with dropna=False:

    df.groupby(['No', 'Name'], dropna=False, as_index=False).size()
    

    Output:

        No Name  size
    0  1.0    A     3
    1  5.0    T     2
    2  9.0    V     1
    3  NaN    M     1
    

    P.S. Interestingly, at the time of writing, pd.Series.value_counts supported the dropna argument, but pd.DataFrame.value_counts did not.


    Update: As pointed out in the other answer, pd.DataFrame.value_counts now also supports dropna=False. This was introduced in pandas v1.3.0, which was released after my original answer was posted.
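For anyone who wants to try both approaches side by side, here is a minimal sketch. It rebuilds the frame from the question by hand (the construction of df is my assumption, since the question only shows the printed table) and assumes pandas 1.3.0 or newer so that DataFrame.value_counts accepts dropna. The names by_group and by_counts are just illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the example data from the question (assumed construction).
df = pd.DataFrame({
    "No": [1, 1, 5, 9, np.nan, 5, 1],
    "Name": ["A", "A", "T", "V", "M", "T", "A"],
})

# Approach 1: groupby with dropna=False keeps the group whose key is NaN.
by_group = df.groupby(["No", "Name"], dropna=False, as_index=False).size()

# Approach 2: DataFrame.value_counts with dropna=False (pandas >= 1.3.0).
# reset_index(name="size") turns the resulting Series into a DataFrame
# with the counts in a column called "size".
by_counts = (
    df[["No", "Name"]]
    .value_counts(dropna=False)
    .reset_index(name="size")
)

print(by_group)
print(by_counts)
```

Both results should contain the same four combinations, including the NaN/M row. The row order can differ: groupby sorts by the key columns, while value_counts sorts by count in descending order.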