python pandas numpy

Convert string-based NaNs to numpy NaNs


I have a dataframe with a part of it shown as below:

2016-12-27              NaN
2016-12-28              NaN
2016-12-29              NaN
2016-12-30              NaN
2016-12-31              NaN
Name: var_name, dtype: object

The column contains NaN as strings/objects. How can I convert them to numpy NaN instead? Ideally, I would do this when reading in the CSV file.


Solution

  • I'd use the converters option of read_csv. The idea is to convert the column in question to numeric values and turn everything that cannot be parsed as a number, including the string 'NaN', into numpy.nan.

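    from io import StringIO
    import pandas as pd
    # txt holds the raw file contents; values that fail to parse become numpy.nan
    converter = lambda x: pd.to_numeric(x, errors='coerce')
    df = pd.read_csv(StringIO(txt), sep=r'\s+', converters={1: converter}, header=None)
    df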
    

    df.dtypes
    
    0     object
    1    float64
    dtype: object
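
For a version that runs on its own, here is a minimal sketch of the same idea. The sample text, the `to_float` helper and the `already_loaded` series are made up for illustration and are not part of the original data. It shows the conversion at read time and, as a fallback, the same `pd.to_numeric` call on a column that was already loaded as strings.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample: a date column and a value column with non-numeric strings.
sample = """2016-12-27 NaN
2016-12-28 1.5
2016-12-29 missing
"""


def to_float(value):
    # Anything that cannot be parsed as a number is coerced to numpy.nan.
    return pd.to_numeric(value, errors="coerce")


# Option 1: convert column 1 while reading the file.
df = pd.read_csv(StringIO(sample), sep=r"\s+", header=None, converters={1: to_float})

# Option 2: convert a column that is already in memory as strings.
already_loaded = pd.Series(["NaN", "1.5", "missing"], dtype=object)
converted = pd.to_numeric(already_loaded, errors="coerce")

print(df.dtypes)
print(converted)
```

If the only offending tokens are known in advance, passing them through the `na_values` argument of `read_csv` may be enough. The converter approach is the more forgiving choice when the column can contain arbitrary junk.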