python pandas dataframe fillna

Pandas fillna throws ValueError: fill value must be in categories


Description: both features have a categorical dtype. I used the same code in a different kernel on the same dataset and it worked fine; the only difference is that the features there were float64. I later converted these features to a categorical dtype because all the features in the dataset represent categories.

Below is the code:

AM_train['product_category_2'].fillna('Unknown', inplace=True)
AM_train['city_development_index'].fillna('Missing', inplace=True)

Solution

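  • Use Series.cat.add_categories to add the new category first, because fillna on a categorical column only accepts values that are already among its categories: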

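    # Register the fill value as a category, then fill and assign the result back.
    # Assigning avoids inplace=True on a column selection, which does not
    # update the DataFrame when Copy-on-Write is enabled.
    AM_train['product_category_2'] = AM_train['product_category_2'].cat.add_categories('Unknown')
    AM_train['product_category_2'] = AM_train['product_category_2'].fillna('Unknown')

    AM_train['city_development_index'] = AM_train['city_development_index'].cat.add_categories('Missing')
    AM_train['city_development_index'] = AM_train['city_development_index'].fillna('Missing')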

    Sample:

    import numpy as np
    import pandas as pd

    AM_train = pd.DataFrame({'product_category_2': pd.Categorical(['a', 'b', np.nan])})
    AM_train['product_category_2'] = AM_train['product_category_2'].cat.add_categories('Unknown')
    AM_train['product_category_2'] = AM_train['product_category_2'].fillna('Unknown')

    print(AM_train)
      product_category_2
    0                  a
    1                  b
    2            Unknown
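
If several categorical columns need the same treatment, the two steps can be wrapped in a small function. The sketch below is only an illustration: `fill_categorical` is a hypothetical helper, not a pandas API, and the sample values for `city_development_index` are made up. It also guards against calling `add_categories` with a category that already exists, which raises a ValueError.

```python
import numpy as np
import pandas as pd


def fill_categorical(series, fill_value):
    """Return a copy of `series` with missing values replaced by `fill_value`.

    Hypothetical helper for illustration; not part of pandas.
    """
    if isinstance(series.dtype, pd.CategoricalDtype):
        # add_categories raises ValueError if the category is already present,
        # so only add it when it is missing.
        if fill_value not in series.cat.categories:
            series = series.cat.add_categories(fill_value)
    # Non-categorical columns fall through to a plain fillna.
    return series.fillna(fill_value)


# Made-up sample data; the column names follow the question.
AM_train = pd.DataFrame({
    'product_category_2': pd.Categorical(['a', 'b', np.nan]),
    'city_development_index': pd.Categorical([1.0, np.nan, 2.0]),
})

AM_train['product_category_2'] = fill_categorical(AM_train['product_category_2'], 'Unknown')
AM_train['city_development_index'] = fill_categorical(AM_train['city_development_index'], 'Missing')

print(AM_train)
```

Because the helper returns a new Series and the result is assigned back to the column, it does not rely on `inplace=True`, and calling it a second time with the same fill value is harmless.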