I'm trying to train a neural network with Keras and TensorFlow. As usual, I replace -np.inf and np.inf values with np.nan so that I can then call dropna and remove all the bad data, like this:
Data.replace([np.inf, -np.inf], np.nan, inplace=True)
Data.dropna(inplace=True)
However, after that I couldn't cast the data to float32, because I got this error [when trying to normalize the values]: ValueError: Input contains infinity or a value too large for dtype('float32'). I tried casting it to float64 instead, which worked. But then the training process kept producing strange errors. So I ran the following snippet:
a = np.array([np.finfo(np.float64).max])
print(x > a.any())
and surprisingly I got this result:
[[ True True True ... False False False]
[ True True True ... False False False]
[ True True True ... False False False]
...
[ True True True ... False False False]
[ True True True ... False False False]
[ True True True ... False False False]]
[[False False False ... True True True]
[False False False ... True True True]
[False False False ... True True True]
...
[False False False ... True True True]
[False False False ... True True True]
[False False False ... True True True]]
meaning there are values (True) bigger than the maximum float64. Aren't those infinite values? Why aren't they replaced by the code above? Is there any way to replace them?
Edit:
I see that the problem occurs not when casting to float64 or float32, but when I try to normalize the results with any normalization function (standard, minmax, etc.).
Instead of specifically looking for infinities, just throw out data which is out of bounds, something like this:
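# Parentheses are required: | binds more tightly than < and >
bad = (Data < -1e20) | (Data > 1e20)  # use whatever your valid range is
# drop() expects index labels, so select the labels of rows with any bad value
Data.drop(Data.index[bad.any(axis='columns')], inplace=True)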
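A side note on the check in the question: `a.any()` reduces the array to a single boolean (True here, because the maximum float64 is nonzero), so `x > a.any()` is really evaluating `x > 1`. That would explain the True entries without any value actually exceeding the float64 maximum. Below is a minimal sketch of a more direct check. It assumes `x` is a NumPy float array, and the sample values are made up for illustration.

```python
import numpy as np

# Hypothetical sample data: 1e39 is finite in float64 but larger than
# the float32 maximum (about 3.4e38).
x = np.array([[0.5, 2.0, 1e39],
              [np.inf, -3.0, 7.0]], dtype=np.float64)

f32_max = np.finfo(np.float32).max

# The check from the question: a.any() collapses to True, so this is `x > 1`.
a = np.array([np.finfo(np.float64).max])
misleading = x > a.any()

# Entries that are inf, -inf or NaN in float64.
non_finite = ~np.isfinite(x)

# Finite entries that would overflow to inf if cast to float32.
too_large_for_f32 = np.isfinite(x) & (np.abs(x) > f32_max)

print(misleading)
print(non_finite)
print(too_large_for_f32)
```

If `too_large_for_f32` contains any True entries, those values survive the replace/dropna step (they are finite in float64) but can still trigger the "value too large for dtype('float32')" error, which is where a range filter like the one above helps.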