python, python-2.7, pandas

Python Pandas find all rows where all values are NaN


So I have a dataframe with 5 columns. I would like to pull the indices where all of the columns are NaN. I was using this code:

nan = pd.isnull(df.all)

but that just returns False, as if it were saying that not all values in the DataFrame are null. There are thousands of entries, so I would prefer not to loop through and check each one. Thanks!


Solution

  • It should just be:

    df.isnull().all(axis=1)
    

    The index can be accessed like:

    df.index[df.isnull().all(axis=1)]
    

    Demonstration

    import numpy as np
    import pandas as pd

    np.random.seed([3, 1415])
    df = pd.DataFrame(np.random.choice((1, np.nan), (10, 2)))
    df
    


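    # Index labels of the rows where every column is NaN
    idx = df.index[df.isnull().all(axis=1)]
    # .ix was removed in pandas 1.0; .loc selects rows by label
    nans = df.loc[idx]
    nans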
    

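Since the random demonstration depends on the draw, a small hand-built frame makes the result easier to check by eye. This is only a sketch: the data and the column names `a` to `e` are made up for illustration. It also shows why the attempt in the question comes back as a single `False`: `pd.isnull(df.all)` is handed the bound method `df.all` itself rather than any data.

```python
import numpy as np
import pandas as pd

# Hypothetical 4-row, 5-column frame; rows 1 and 3 are entirely NaN.
df = pd.DataFrame(
    [
        [1.0, 2.0, 3.0, 4.0, 5.0],
        [np.nan] * 5,
        [np.nan, 2.0, np.nan, 4.0, np.nan],
        [np.nan] * 5,
    ],
    columns=list("abcde"),
)

# isnull() gives a boolean frame; all(axis=1) reduces across each row.
all_nan_mask = df.isnull().all(axis=1)

# Index labels of the rows that are NaN in every column.
all_nan_index = df.index[all_nan_mask]
print(all_nan_index.tolist())  # [1, 3]

# The question's version passes the method object, not a frame,
# so the result is one scalar instead of a per-row mask.
method_check = pd.isnull(df.all)
print(method_check)  # False
```

Row 2 has some NaN values but not all, so it is left out of the result.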


    Timing

    code

    np.random.seed([3,1415])
    df = pd.DataFrame(np.random.choice((1, np.nan), (10000, 5)))
    

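One way to time this on the 10000 x 5 frame is sketched below, using the standard-library `timeit` module. The helper names `with_pandas` and `with_numpy` are hypothetical, and the NumPy variant is only an assumed point of comparison, not necessarily what was measured originally. No timings are quoted here because they vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

np.random.seed([3, 1415])
df = pd.DataFrame(np.random.choice((1, np.nan), (10000, 5)))


def with_pandas(frame):
    # Row-wise "all NaN" mask, returned as a boolean Series.
    return frame.isnull().all(axis=1)


def with_numpy(frame):
    # Same check on the underlying float array, returned as a boolean ndarray.
    return np.isnan(frame.values).all(axis=1)


runs = 100
for fn in (with_pandas, with_numpy):
    seconds = timeit.timeit(lambda: fn(df), number=runs)
    print("{}: {:.1f} us per call".format(fn.__name__, seconds / runs * 1e6))
```

The NumPy version assumes every column is numeric; `np.isnan` raises a `TypeError` on object-dtype data, whereas `isnull()` handles mixed dtypes.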