python, r, character, na

Removing NA_character_


I'm reading an Excel file in some Python code. I'm using R's readxl package because it is much faster than pandas, then converting the result to a pandas DataFrame using rpy2's rpy2py conversion.

Any blank cells show up as "NA_character_" in the pandas DataFrame.

I tried to remove these rows using pandas' dropna(subset=['col']), with no luck. The cells still show as "NA_character_".

Thanks in advance.

-BigEars


Solution

  • How about this?

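    import numpy as np

    # Turn the literal 'NA_character_' values into real missing values (NaN)
    dataframe.replace('NA_character_', np.nan, inplace=True)
    # Drop every row that now contains a NaN; pass subset=['col'] to check only that column
    dataframe.dropna(axis='index', inplace=True)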
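
dropna only removes values pandas already treats as missing (NaN or None), which is why the original call had no effect. Below is a minimal sketch of the same idea restricted to chosen columns. The helper name drop_na_character and the sample data are hypothetical. It also assumes that, if rpy2 hands back NA sentinel objects rather than plain strings, their str() or repr() is 'NA_character_' as displayed in the question; that part is not verified against rpy2 here.

```python
import numpy as np
import pandas as pd

R_NA_MARKER = "NA_character_"  # text shown for blank cells in the question


def drop_na_character(df, subset):
    """Return a copy of df without rows whose `subset` columns hold R's NA marker.

    Hypothetical helper: values are matched by their str()/repr() so that both
    plain strings and (assumed) rpy2 NA sentinel objects are caught.
    """
    cleaned = df.copy()
    for col in subset:
        is_r_na = cleaned[col].map(
            lambda v: str(v) == R_NA_MARKER or repr(v) == R_NA_MARKER
        )
        # Replace matched cells with NaN so dropna can see them
        cleaned[col] = cleaned[col].mask(is_r_na, np.nan)
    # Only the listed columns decide whether a row is dropped
    return cleaned.dropna(subset=subset)


# Made-up sample data standing in for the converted Excel sheet
sample = pd.DataFrame(
    {"col": ["a", "NA_character_", "b"], "other": ["x", "y", "NA_character_"]}
)
result = drop_na_character(sample, subset=["col"])
```

Here only the row whose 'col' cell is blank is removed; the marker in 'other' is left alone because that column is not in subset.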