I'm studying machine learning from the handson-ml2 book. The topic is early stopping with stochastic gradient descent, and I'm running the following code in PyCharm:
```python
from copy import deepcopy

import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

np.random.seed(42)
m = 100
X = 6 * np.random.rand(m, 1) - 3
y = 2 + X + 0.5 * X**2 + np.random.randn(m, 1)

X_train, X_val, y_train, y_val = train_test_split(
    X[:50], y[:50].ravel(), test_size=0.5, random_state=10)

# Prepare the data
poly_scaler = Pipeline([
    ("poly_features", PolynomialFeatures(degree=90, include_bias=False)),
    ("std_scaler", StandardScaler())
])
X_train_poly_scaled = poly_scaler.fit_transform(X_train)
X_val_poly_scaled = poly_scaler.transform(X_val)

sgd_reg = SGDRegressor(max_iter=1, tol=-np.infty, warm_start=True,
                       penalty=None, learning_rate="constant",
                       eta0=0.0005, random_state=42)

minimum_val_error = float("inf")
best_epoch = None
best_model = None
for epoch in range(1000):
    sgd_reg.fit(X_train_poly_scaled, y_train)  # continues where it left off (warm_start=True)
    y_val_predict = sgd_reg.predict(X_val_poly_scaled)
    val_error = mean_squared_error(y_val, y_val_predict)
    if val_error < minimum_val_error:
        minimum_val_error = val_error
        best_epoch = epoch
        best_model = deepcopy(sgd_reg)
print("best_epoch:", best_epoch, "best_model:", best_model)
```
where I'm getting this error:

```
sklearn.utils._param_validation.InvalidParameterError: The 'tol' parameter of SGDRegressor must be a float in the range [0, inf) or None. Got -inf instead.
```
This error says the `tol` parameter can't be set to `-inf`, yet the same code appears to work in the book. How can I fix this problem?
The code appears to work in the book because of a version difference: older releases of scikit-learn accepted any value for the `tol` hyperparameter of `SGDRegressor`, while newer releases validate the parameter and reject `-inf`.
If you'd like to copy the code from the book without making any modifications, make sure you're using the same library, and, as an extra measure, the same version that was used in the book. You can check the installed version with `sklearn.__version__`.
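For example, a minimal check from a Python shell (assuming scikit-learn and NumPy are the relevant libraries here):

```python
import numpy as np
import sklearn

# The book targets an older scikit-learn release; the strict parameter
# validation that rejects tol=-inf was introduced in a later version.
print("numpy:", np.__version__)
print("scikit-learn:", sklearn.__version__)
```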
Otherwise, if you're fine with making a small change to the code, set `tol` to `None` (or to `0`). With `tol=None` the convergence check is disabled entirely, and since the code calls `fit()` with `max_iter=1`, that check never triggers anyway: each `fit()` call runs a single epoch, and your outer loop decides when to stop training.
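The one-line change would look like this (a sketch of the constructor call only; the rest of your code stays the same):

```python
from sklearn.linear_model import SGDRegressor

# tol=None disables the convergence check, matching the intent of the
# book's tol=-np.infty; with max_iter=1 the outer epoch loop is in control.
sgd_reg = SGDRegressor(max_iter=1, tol=None, warm_start=True,
                       penalty=None, learning_rate="constant",
                       eta0=0.0005, random_state=42)
```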