I am trying to build a scikit-learn compatible estimator. I have built a custom class that inherits from BaseEstimator and RegressorMixin. However, when I try to use it, I run into an AttributeError: 'dict' object has no attribute 'requires_fit' that I do not know how to solve. Below is a minimal code example. The logic used to obtain the parameters in this class is irrelevant; I am just interested in getting it to work with the appropriate validations and checks.
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.utils.validation import check_is_fitted, check_X_y


class BaseModel(BaseEstimator, RegressorMixin):
    """
    Base class for penalized regression models using cp.
    """

    def __init__(self, param: float = 0.5):
        self.param = param

    def _obtain_beta(self, X, y):
        # Placeholder logic: one coefficient per feature (column of X)
        self.intercept_ = 0
        self.coef_ = self.param * np.ones(X.shape[1])

    def fit(self, X: np.ndarray, y: np.ndarray):
        self.feature_names_in_ = None
        # `columns` is an attribute, not a method, so only check that it exists
        if hasattr(X, "columns"):
            self.feature_names_in_ = np.asarray(X.columns, dtype=object)
        X, y = check_X_y(X, y, accept_sparse=False, y_numeric=True, ensure_min_samples=2)
        self.n_features_in_ = X.shape[1]
        # Solve the problem
        self._obtain_beta(X, y)
        self.is_fitted_ = True
        return self

    def predict(self, X: np.ndarray) -> np.ndarray:
        check_is_fitted(self, ["coef_", "intercept_", "is_fitted_"])
        predictions = np.dot(X, self.coef_) + self.intercept_
        return predictions

    def __sklearn_tags__(self):
        tags = {
            "allow_nan": False,
            "requires_y": True,
            "requires_fit": True,
        }
        return tags


# USAGE EXAMPLE
from sklearn.datasets import make_regression

X, y, beta = make_regression(n_samples=200, n_features=200, n_informative=25, bias=10, noise=5, random_state=42, coef=True)
model = BaseModel()
model.fit(X, y)
Executing this produces this error:
AttributeError: 'dict' object has no attribute 'requires_fit'
File ~\anaconda3\envs\py311env\Lib\site-packages\IPython\core\formatters.py:1036, in MimeBundleFormatter.__call__(self, obj, include, exclude)
1033 method = get_real_method(obj, self.print_method)
1035 if method is not None:
-> 1036 return method(include=include, exclude=exclude)
1037 return None
1038 else:
It says it has no attribute requires_fit, but I am including that attribute in the tags. I am working on a Windows 11 machine with the following requirements:
Python 3.11.12
numpy 2.0.2
scikit-learn 1.6.1
Changing the Python or package versions is not an option for me in this case.
It seems they changed the code, and __sklearn_tags__ now has to return an instance of the special class Tags instead of a dictionary:
def __sklearn_tags__(self):
    tags = super().__sklearn_tags__()
    tags.input_tags.allow_nan = False
    tags.target_tags.required = True  # `required` replaces `requires_y`
    tags.requires_fit = True
    return tags
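To see why the dictionary version fails, here is a rough illustration based only on the error message: the caller appears to read the tag with attribute access (tags.requires_fit), which a dict does not support. FakeTags and read_requires_fit are hypothetical stand-ins written for this sketch; they are not scikit-learn's real Tags class or internal code.

```python
from dataclasses import dataclass


@dataclass
class FakeTags:
    """Hypothetical stand-in for a tags object (NOT scikit-learn's Tags)."""
    requires_fit: bool = True


def read_requires_fit(tags):
    # Attribute access, as the error message in the question suggests
    return tags.requires_fit


old_style = {"requires_fit": True}       # what the question returns
new_style = FakeTags(requires_fit=True)  # object with a real attribute

try:
    read_requires_fit(old_style)
    dict_error = None
except AttributeError as exc:
    dict_error = str(exc)

print(dict_error)                    # 'dict' object has no attribute 'requires_fit'
print(read_requires_fit(new_style))  # True
```

The key "requires_fit" is present in the dict, but keys are not attributes, so the lookup fails in the same way as in the traceback.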
Some information I found in the file sklearn/utils/_tags.py: it seems that in 1.6.1 there still exists the function _to_new_tags(), which can convert old tags (dict) to new tags (class Tags), but it is removed in 1.7.0.
Repo for 1.6.X: sklearn/utils/_tags.py