python pandas dataframe nan

How to check if any value is NaN in a Pandas DataFrame


How do I check whether a pandas DataFrame has NaN values?

I know about pd.isna, but it returns a DataFrame of booleans. I also found this post, but it doesn't exactly answer my question either.


Solution

  • jwilner's response is spot on. I wanted to see whether there's a faster option, since in my experience summing flat arrays is (strangely) faster than counting. This code seems faster:

    df.isnull().values.any()
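
    As a minimal sketch of how this check behaves, assuming a small made-up DataFrame (the column names "a" and "b" are hypothetical, chosen only for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical example data: column "b" contains a single NaN.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, np.nan, 6.0]})

# isnull() gives a boolean DataFrame; .values exposes the underlying
# NumPy array, and .any() collapses it to one answer for the whole frame.
has_nan = bool(df.isnull().values.any())

# Without .values, any() reduces per column and returns a boolean Series,
# which is useful when you need to know which columns contain NaNs.
nan_by_column = df.isnull().any()

print(has_nan)
print(nan_by_column)
```

    The scalar form is the one to use in an `if` statement; the per-column Series cannot be used as a single truth value directly.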
    


    import numpy as np
    import pandas as pd
    import perfplot
    
    
    def setup(n):
        # One-column frame of n standard-normal draws;
        # every value above 0.9 is replaced with NaN.
        df = pd.DataFrame(np.random.randn(n))
        df[df > 0.9] = np.nan
        return df
    
    
    def isnull_any(df):
        return df.isnull().any()
    
    
    def isnull_values_sum(df):
        return df.isnull().values.sum() > 0
    
    
    def isnull_sum(df):
        return df.isnull().sum() > 0
    
    
    def isnull_values_any(df):
        return df.isnull().values.any()
    
    
    perfplot.save(
        "out.png",
        setup=setup,
        kernels=[isnull_any, isnull_values_sum, isnull_sum, isnull_values_any],
        n_range=[2 ** k for k in range(25)],
    )
    

    df.isnull().sum().sum() is a bit slower, but it also gives you extra information: the total number of NaNs.
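
    A short sketch of the counting variant, again on made-up data (the frame and its column names are assumptions for illustration only):

```python
import numpy as np
import pandas as pd

# Hypothetical example data: one NaN in "a", two NaNs in "b".
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, np.nan, 6.0]})

# The first sum() counts NaNs per column (True counts as 1).
nan_per_column = df.isnull().sum()

# The second sum() adds the per-column counts into one total.
total_nans = int(nan_per_column.sum())

# The yes/no answer falls out of the count.
has_nan = total_nans > 0

print(nan_per_column)
print(total_nans, has_nan)
```

    If only the yes/no answer is needed, the `.values.any()` form avoids the counting work; the double sum is worth it when the count itself is useful.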