python pandas list dataframe isnan

How to check for not-NA and non-empty list values in a DataFrame column?


import pandas as pd
from numpy import nan
d = {'status': {0: 'No', 1: 'No', 2: 'Yes', 3: 'No'}, 'time': {0: "['Morning', 'Midday', 'Afternoon']", 1: nan, 2: "[]", 3: nan}, 'id': {0: 1, 1: 5, 2: 2, 3: 3}}
df = pd.DataFrame(d)

df is the DataFrame. All columns are object types.

I need to check, across all columns of the DataFrame, that each row has no NA values and no empty lists. This is what I tried:

df['no_nans'] = ~pd.isna(df).any(axis = 1)
print(df['no_nans'])

True
False
True
False

The expected output is:

True
False
False
False

Because the time column has an empty list [] in the third row, isna() does not flag that row.

Is there a simple and easy way to put this check properly? Thanks in advance for any help.
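
For context, here is a minimal sketch of why isna() skips the empty list, assuming the cell holds a real Python list. The Series `s` is made up for illustration and is not part of the DataFrame above.

```python
import numpy as np
import pandas as pd

# Object Series holding a real list, a missing value, and an empty list.
s = pd.Series([['Morning', 'Midday'], np.nan, []], dtype=object)

# isna() flags only the NaN entry; an empty list is still a present value.
print(s.isna().tolist())        # [False, True, False]

# Truthiness is a separate test: the empty list is falsy, while NaN is truthy.
print(s.astype(bool).tolist())  # [True, True, False]
```

So neither test alone covers both cases, and the two masks have to be combined.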


Solution

  • If the cells hold actual empty lists, tuples, sets or strings: select the object columns with DataFrame.select_dtypes and convert them to booleans, so that empty values become False. Then add back the missing non-object columns with DataFrame.reindex (filled with True), chain the df.notna() mask with & for bitwise AND, and check that all values are True per row with DataFrame.all:

    m = (df.select_dtypes(object).astype(bool).reindex(df.columns, axis=1, fill_value=True) & 
         df.notna()).all(axis=1)
    print (m)
    0     True
    1    False
    2    False
    3    False
    dtype: bool
    

    Details:

    print (df.select_dtypes(object))
      status                                time
    0     No  ['Morning', 'Midday', 'Afternoon']
    1     No                                 NaN
    2    Yes                                  []
    3     No                                 NaN
    
    print (df.select_dtypes(object).astype(bool))
       status   time
    0    True   True
    1    True   True
    2    True  False
    3    True   True
    
    print (df.select_dtypes(object).astype(bool).reindex(df.columns, axis=1, fill_value=True))
       status   time    id
    0    True   True  True
    1    True   True  True
    2    True  False  True
    3    True   True  True
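
One caveat: astype(bool) only turns genuinely empty objects into False. In the dict from the question the time values are quoted, so the cell is the two-character string "[]", and that string is truthy. Below is a minimal elementwise sketch that handles both real empty containers and such strings. It is not the vectorised mask above but a slower per-cell alternative. The names is_filled and rows_fully_filled are hypothetical, and treating "" and "[]" as empty is an assumption.

```python
import numpy as np
import pandas as pd

def is_filled(value, empty_strings=("", "[]")):
    """True if the cell is neither missing nor empty (hypothetical helper)."""
    if isinstance(value, str):
        # Assumption: the strings "" and "[]" count as empty.
        return value.strip() not in empty_strings
    if isinstance(value, (list, tuple, set, dict)):
        return len(value) > 0
    # Any remaining scalar: filled unless it is NaN/None/NA.
    return not pd.isna(value)

def rows_fully_filled(df):
    """Boolean Series: True where every cell in the row passes is_filled."""
    checked = df.apply(lambda col: col.map(is_filled).astype(bool))
    return checked.all(axis=1)

# Same data twice: once with list-like strings, once with real lists.
df_strings = pd.DataFrame({
    "status": ["No", "No", "Yes", "No"],
    "time": ["['Morning', 'Midday', 'Afternoon']", np.nan, "[]", np.nan],
    "id": [1, 5, 2, 3],
})
df_lists = pd.DataFrame({
    "status": ["No", "No", "Yes", "No"],
    "time": [["Morning", "Midday", "Afternoon"], np.nan, [], np.nan],
    "id": [1, 5, 2, 3],
})

print(rows_fully_filled(df_strings).tolist())  # [True, False, False, False]
print(rows_fully_filled(df_lists).tolist())    # [True, False, False, False]
```

Because it calls a Python function per cell, this is slower than the boolean masks above on large frames, but it does not depend on how the empty list is stored.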