python scikit-learn regression

Building a sklearn compatible estimator: 'dict' object has no attribute 'requires_fit'


I am trying to build a scikit-learn compatible estimator. I have built a custom class that inherits from BaseEstimator and RegressorMixin. However, when I try to use it, I run into an error that I do not know how to solve: AttributeError: 'dict' object has no attribute 'requires_fit'. Below is a minimal code example. The logic used to obtain the parameters in this class is irrelevant; I am just interested in getting it to work with the appropriate validations and checks.

from sklearn.utils.validation import check_is_fitted, check_X_y
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin

class BaseModel(BaseEstimator, RegressorMixin):
    """
    Base class for penalized regression models using cp.
    """
    def __init__(self,
                param: float = 0.5):
        self.param = param
    
    def _obtain_beta(self, X, y):
        self.intercept_ = 0
        self.coef_ = self.param * np.ones(X.shape[1])  # one coefficient per feature

    def fit(self, X: np.ndarray, y: np.ndarray):
        self.feature_names_in_ = None
        if hasattr(X, "columns"):  # DataFrame.columns is an Index, not a callable
            self.feature_names_in_ = np.asarray(X.columns, dtype=object)
        X, y = check_X_y(X, y, accept_sparse=False, y_numeric=True, ensure_min_samples=2)
        self.n_features_in_ = X.shape[1]
        # Solve the problem
        self._obtain_beta(X, y)
        self.is_fitted_ = True
        return self

    def predict(self, X: np.ndarray) -> np.ndarray:
        check_is_fitted(self, ["coef_", "intercept_", "is_fitted_"])
        predictions = np.dot(X, self.coef_) + self.intercept_
        return predictions

    def __sklearn_tags__(self):
        tags = {
            "allow_nan": False,
            "requires_y": True,  
            "requires_fit": True,     
        }
        return tags

# USAGE EXAMPLE

from sklearn.datasets import make_regression

X, y, beta = make_regression(n_samples=200, n_features=200, n_informative=25, bias=10, noise=5, random_state=42, coef=True)
model = BaseModel()
model.fit(X, y)

Executing this produces this error:

AttributeError: 'dict' object has no attribute 'requires_fit'
File ~\anaconda3\envs\py311env\Lib\site-packages\IPython\core\formatters.py:1036, in MimeBundleFormatter.__call__(self, obj, include, exclude)
1033     method = get_real_method(obj, self.print_method)
1035     if method is not None:
-> 1036         return method(include=include, exclude=exclude)
1037     return None
1038 else:

It says there is no attribute requires_fit, but I am including that attribute in the tags. I am working on a Windows 11 machine with the following versions:

Python 3.11.12
numpy 2.0.2
scikit-learn 1.6.1

Changing the Python or package versions is not an option for me in this case.


Solution

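  • It seems the API changed: in scikit-learn 1.6, __sklearn_tags__ has to return an instance of the special Tags class instead of a dictionary. Get the default tags from super() and set the fields as attributes: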

    def __sklearn_tags__(self):
        tags = super().__sklearn_tags__()
    
        tags.input_tags.allow_nan = False
        tags.target_tags.required = True   # `required` replaces `requires_y`
        tags.requires_fit = True
    
        return tags
    

    Some information I found in the file sklearn/utils/_tags.py:

    It seems that 1.6.1 still has a function _to_new_tags(), which can convert old tags (dict) to new tags (class Tags), but it was removed in 1.7.0.

    Repo for 1.6.X: sklearn/utils/_tags.py
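
As for why the message mentions both 'dict' and 'requires_fit': a key stored in a dict is not an attribute, so any code that reads tags.requires_fit fails on a dict even when the key is present. The sketch below reproduces that with standard-library stand-ins only. TagsSketch, InputTagsSketch and needs_fit are hypothetical names made up for this illustration; they are not scikit-learn's actual classes or functions.

```python
from dataclasses import dataclass, field


@dataclass
class InputTagsSketch:
    """Hypothetical stand-in for a nested group of input tags."""
    allow_nan: bool = False


@dataclass
class TagsSketch:
    """Hypothetical stand-in for a tags object that is read via attributes."""
    requires_fit: bool = True
    input_tags: InputTagsSketch = field(default_factory=InputTagsSketch)


def needs_fit(tags):
    # Mimics a caller that reads a tag as an attribute, not as a dict key.
    return tags.requires_fit


# The dict from the question: the key exists, but it is not an attribute.
dict_tags = {"allow_nan": False, "requires_y": True, "requires_fit": True}

try:
    needs_fit(dict_tags)
    dict_error = None
except AttributeError as exc:
    dict_error = str(exc)

print(dict_error)                 # the same message as in the question
print(dict_tags["requires_fit"])  # key lookup on the dict works fine
print(needs_fit(TagsSketch()))    # attribute lookup works on the dataclass
```

In the real estimator the fix stays the one shown in the answer: take the object returned by super().__sklearn_tags__() and set its attributes, rather than building a new dict.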