I'm working with a DataFrame of trip start and end locations, similar to what a ride-hailing app such as Uber would use.
I was given a DataFrame with 2 columns that contain many missing values: some are real NaN values and others are strings written as "nan". Here is an example of what the data looks like:
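import numpy as np
import pandas as pd

# Mix of real NaN values (np.nan) and the literal string "nan"
data = [
    ["Sevilla", "Granada"],
    ["nan", "Madrid"],
    ["Burgos", np.nan],
    ["Badajoz", "Valencia"],
    [np.nan, np.nan],
    [np.nan, "nan"],
    [np.nan, "Barcelona"],
]
df = pd.DataFrame(
    data,
    columns=["start_station_name", "end_station_name"],
)
print(df)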
I need to remove both the "nan" strings and the real NaN values. The main problem is that when I use .dropna(inplace=True), all the values are lost and every cell shows None. How do I keep the valid ones?
What I've tried is the following.
First, I replaced all the "nan" strings, which are not real NaN values, with np.nan. Second, I tried to drop all the NaN values while keeping the city names.
df["start_station_name"] = df["start_station_name"].replace("nan",np.nan)
df["end_station_name"] = df["end_station_name"].replace("nan",np.nan)
df["start_station_name"] = df["start_station_name"].dropna(inplace=True)
df["end_station_name"] = df["end_station_name"].dropna(inplace=True)
print(df[["start_station_name","end_station_name"]])
It shows only None (so all the values are None).
You need to combine replace and dropna directly on the DataFrame, not at the column/Series level:
out = df.replace('nan', float('nan')).dropna()
Output:
  start_station_name end_station_name
0            Sevilla          Granada
3            Badajoz         Valencia
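Your original approach didn't work for two reasons. First, dropna(inplace=True) returns None, so assigning its result back to the column fills the whole column with None. Second, even without inplace=True, you would be dropping the NaNs for each column individually, and a column is realigned to the DataFrame's index when it is inserted back, which recreates the NaNs.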
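To see what happens at the Series level, here is a minimal sketch on a hypothetical four-row column (the data and the variable names col, returned, and shorter are made up for illustration). It shows that dropna(inplace=True) returns None, and that a shortened Series is realigned on the index when assigned back:

```python
import numpy as np
import pandas as pd

# Hypothetical small column, for illustration only
df = pd.DataFrame({"start_station_name": ["Sevilla", "nan", np.nan, "Badajoz"]})
col = df["start_station_name"].replace("nan", np.nan)

# With inplace=True, dropna modifies the Series it is called on and returns None
returned = col.copy().dropna(inplace=True)
print(returned)  # None

# Without inplace, a shorter Series is returned (only index labels 0 and 3 remain)
shorter = col.dropna()
print(shorter.index.tolist())  # [0, 3]

# Assigning it back aligns on the index, so rows 1 and 2 become NaN again
df["start_station_name"] = shorter
print(df["start_station_name"].isna().tolist())  # [False, True, True, False]
```

This is why the row filtering has to happen on the DataFrame as a whole rather than column by column.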
(not needed but just left here in case it's useful to others)
If for some reason you want to keep the existing 'nan' strings, but still treat them as missing when dropping the rows in which all values are NaN/'nan':
out = df[df.replace('nan', float('nan')).notna().any(axis=1)]
Output:
  start_station_name end_station_name
0            Sevilla          Granada
1                nan           Madrid  # notice the 'nan'
2             Burgos              NaN
3            Badajoz         Valencia
6                NaN        Barcelona
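As a possible alternative, the same row selection can be written with dropna(how="all") on the cleaned copy and then used to index the original frame. This is only a sketch: the names cleaned, keep_index, out_mask, and out_alt are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [
        ["Sevilla", "Granada"],
        ["nan", "Madrid"],
        ["Burgos", np.nan],
        ["Badajoz", "Valencia"],
        [np.nan, np.nan],
        [np.nan, "nan"],
        [np.nan, "Barcelona"],
    ],
    columns=["start_station_name", "end_station_name"],
)

# Temporary copy in which the 'nan' strings count as missing
cleaned = df.replace("nan", np.nan)

# Boolean-mask version: keep rows with at least one real value
out_mask = df[cleaned.notna().any(axis=1)]

# Alternative: drop the all-missing rows on the copy, then select
# the surviving index labels from the untouched original
keep_index = cleaned.dropna(how="all").index
out_alt = df.loc[keep_index]

print(out_alt.index.tolist())    # [0, 1, 2, 3, 6]
print(out_alt.equals(out_mask))  # True
```

Both versions leave the original 'nan' strings in the result, because the replacement is only applied to the temporary copy.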