python pandas csv

Pandas: ValueError: cannot convert float NaN to integer


I get ValueError: cannot convert float NaN to integer for the following code:

import pandas

df = pandas.read_csv('zoom11.csv')
df[['x']] = df[['x']].astype(int)

Update: Using the hints in the comments and answers, I cleaned my data with this:

# x contained NaN
df = df[~df['x'].isnull()]

# Y contained some other garbage, so null check was not enough
df = df[df['y'].str.isnumeric()]

# final conversion now worked
df[['x']] = df[['x']].astype(int)
df[['y']] = df[['y']].astype(int)

Solution

  • To identify NaN values, use boolean indexing:

    print(df[df['x'].isnull()])
    

    Then, to get rid of non-numeric values, use to_numeric with the parameter errors='coerce', which replaces non-numeric values with NaN:

    df['x'] = pd.to_numeric(df['x'], errors='coerce')
    

    To remove all rows with NaN in column x, use dropna:

    df = df.dropna(subset=['x'])
    

    Finally, convert the values to ints:

    df['x'] = df['x'].astype(int)
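
The steps above can be put together in one small script. This is only a sketch: the inline CSV text is made-up sample data standing in for zoom11.csv, which is not shown in the question, and it assumes that dropping the bad rows (rather than filling them) is what you want.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the real CSV file:
# column x has one missing value and one non-numeric entry.
csv_text = "x,y\n1,10\n,20\nabc,30\n4,40\n"
df = pd.read_csv(io.StringIO(csv_text))

# Show the rows where x is missing before cleaning.
print(df[df['x'].isnull()])

# Replace anything that cannot be parsed as a number with NaN.
df['x'] = pd.to_numeric(df['x'], errors='coerce')

# Drop the rows where x is NaN, then convert the column to int.
df = df.dropna(subset=['x'])
df['x'] = df['x'].astype(int)

print(df)
```

Only the rows whose x value parses as a number survive, so the final astype(int) no longer encounters NaN.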