I get ValueError: cannot convert float NaN to integer for the following code:
import pandas
df = pandas.read_csv('zoom11.csv')
df[['x']] = df[['x']].astype(int)
Update: Using the hints in the comments/answers, I got my data clean with this:
# x contained NaN
df = df[~df['x'].isnull()]
# Y contained some other garbage, so null check was not enough
df = df[df['y'].str.isnumeric()]
# final conversion now worked
df[['x']] = df[['x']].astype(int)
df[['y']] = df[['y']].astype(int)
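For anyone who wants to try the same filtering without my CSV, here is a minimal sketch on made-up data. The frame `demo` and its values are assumptions for illustration only, not the contents of zoom11.csv.

```python
import numpy as np
import pandas as pd

# Made-up data: x has a NaN, y has a non-numeric string.
demo = pd.DataFrame({
    'x': [1.0, np.nan, 3.0, 4.0],
    'y': ['10', '20', 'abc', '40'],
})

# Keep only the rows where x is not NaN.
demo = demo[~demo['x'].isnull()]
# Keep only the rows where y consists solely of numeric characters.
# .copy() gives an independent frame, so the assignments below are safe.
demo = demo[demo['y'].str.isnumeric()].copy()

# Both columns can now be cast to int.
demo['x'] = demo['x'].astype(int)
demo['y'] = demo['y'].astype(int)
print(demo)
```

Note that str.isnumeric() returns False for strings containing a minus sign or a decimal point, so this filter would also drop values such as '-5' or '1.5'.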
To identify NaN values, use boolean indexing:
print(df[df['x'].isnull()])
Then, to remove all non-numeric values, use to_numeric with the parameter errors='coerce', which replaces non-numeric values with NaNs:
df['x'] = pd.to_numeric(df['x'], errors='coerce')
To remove all rows with NaNs in column x, use dropna:
df = df.dropna(subset=['x'])
Finally, convert the values to ints:
df['x'] = df['x'].astype(int)
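Put together, the steps above can be sketched end to end as follows. The helper name `clean_int_column` and the sample data are hypothetical, chosen only for illustration; they are not from the question's CSV.

```python
import numpy as np
import pandas as pd


def clean_int_column(df, column):
    """Return a copy of df with `column` cast to int, dropping unusable rows.

    Hypothetical helper for illustration only.
    """
    out = df.copy()
    # Non-numeric entries become NaN instead of raising an error.
    out[column] = pd.to_numeric(out[column], errors='coerce')
    # Drop the rows where the column is NaN (original or coerced).
    out = out.dropna(subset=[column]).copy()
    # With no NaN left, the cast to int succeeds.
    out[column] = out[column].astype(int)
    return out


# Made-up sample standing in for the CSV contents.
sample = pd.DataFrame({
    'x': [1.0, np.nan, '3', 'abc'],
    'y': ['a', 'b', 'c', 'd'],
})
cleaned = clean_int_column(sample, 'x')
print(cleaned)
```

The rows holding NaN and 'abc' are dropped, and the remaining x values are stored as integers. The original index labels are kept; call reset_index(drop=True) if a fresh 0..n-1 index is wanted.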