I have a large dataframe with inf and -inf values in various columns. I want to replace all inf and -inf values with NaN.
I can do so column by column, so this works:
df['column name'] = df['column name'].replace(np.inf, np.nan)
But my code to do so in one go across the whole dataframe does not:
df.replace([np.inf, -np.inf], np.nan)
The inf values are still there in df afterwards.
df.replace is fastest for replacing ±inf, but mode.use_inf_as_na (deprecated) lets you avoid replacing altogether.
Replacing inf and -inf:
df = df.replace([np.inf, -np.inf], np.nan)
Just make sure to assign the result back, since df.replace returns a new dataframe rather than modifying df. (Don't use the inplace approach, which is being deprecated in PDEP-8.)
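To make the assignment point concrete, here is a minimal sketch. The sample dataframe and the variable names are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration
df = pd.DataFrame({"a": [1.0, np.inf, 3.0], "b": [-np.inf, 2.0, np.nan]})

# Calling replace without assigning the result leaves df untouched
df.replace([np.inf, -np.inf], np.nan)
still_has_inf = bool(np.isinf(df.to_numpy()).any())

# Assigning the result back is what actually updates df
df = df.replace([np.inf, -np.inf], np.nan)
has_inf_after = bool(np.isinf(df.to_numpy()).any())

print(still_has_inf, has_inf_after)
```

The first flag should show that the unassigned call changed nothing, and the second that the reassigned dataframe no longer contains ±inf.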
There are other options based on df.applymap (renamed to DataFrame.map in pandas 2.1), but df.replace is fastest:
df = df.applymap(lambda x: np.nan if x in [np.inf, -np.inf] else x)
df = df.applymap(lambda x: np.nan if np.isinf(x) else x)
df = df.applymap(lambda x: x if np.isfinite(x) else np.nan)
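If you want to check this on your own data, a rough sketch along these lines compares the four variants. The frame size, the seed, and the names elementwise and candidates are assumptions for illustration, and the timings will depend on your machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical all-float frame for illustration
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((1000, 5)), columns=list("abcde"))
df.iloc[::50, 0] = np.inf
df.iloc[::70, 1] = -np.inf

# DataFrame.applymap was renamed to DataFrame.map in pandas 2.1,
# so pick whichever one this pandas version provides
elementwise = df.map if hasattr(df, "map") else df.applymap

candidates = {
    "replace": lambda: df.replace([np.inf, -np.inf], np.nan),
    "in-list": lambda: elementwise(lambda x: np.nan if x in [np.inf, -np.inf] else x),
    "isinf": lambda: elementwise(lambda x: np.nan if np.isinf(x) else x),
    "isfinite": lambda: elementwise(lambda x: x if np.isfinite(x) else np.nan),
}

# All variants should produce the same frame (equals treats NaNs as equal)
expected = candidates["replace"]()
results_match = all(func().equals(expected) for func in candidates.values())

# Time each variant a few times and print from fastest to slowest
timings = {name: timeit.timeit(func, number=3) for name, func in candidates.items()}
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name:>8}: {seconds:.4f} s")
```

The element-wise variants call a Python function once per cell, which is why they tend to lose to df.replace on larger frames.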
Setting mode.use_inf_as_na (deprecated)
Note that we don't actually have to modify df at all. Setting mode.use_inf_as_na will simply change the way inf and -inf are interpreted:
True means treat None, nan, -inf, inf as null
False means None and nan are null, but inf, -inf are not null (default)
Either enable it globally:
pd.set_option('mode.use_inf_as_na', True)
Or locally via a context manager:
with pd.option_context('mode.use_inf_as_na', True):
    ...
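As a sketch of the context-manager route, the helper below counts nulls with ±inf treated as null and leaves the data unchanged. The function name count_nulls_with_inf_as_na and the sample series are hypothetical. Because the option is deprecated, the sketch silences the FutureWarning and, as an assumption on my part, falls back to an explicit np.isinf check if the option is not available in your pandas version.

```python
import warnings

import numpy as np
import pandas as pd

# Hypothetical sample data for illustration
s = pd.Series([1.0, np.inf, -np.inf, np.nan])


def count_nulls_with_inf_as_na(series):
    """Count nulls while treating ±inf as null, without modifying the series."""
    try:
        with warnings.catch_warnings():
            # The option is deprecated, so setting it may emit a FutureWarning
            warnings.simplefilter("ignore", FutureWarning)
            with pd.option_context("mode.use_inf_as_na", True):
                return int(series.isna().sum())
    except KeyError:
        # Assumed fallback: an unknown option raises OptionError (a KeyError
        # subclass), so check for ±inf explicitly instead
        return int((series.isna() | np.isinf(series)).sum())


null_count = count_nulls_with_inf_as_na(s)
print(null_count)
```

Outside the with block the default interpretation applies again, so s.isna() only flags the real NaN.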