python, machine-learning, scikit-learn

How to get coefficients and feature importances from MultiOutputRegressor?


I am trying to perform a MultiOutput Regression using ElasticNet and Random Forests as follows:

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNetCV
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputRegressor

X_train, X_test, y_train, y_test = train_test_split(X_features, y, test_size=0.30, random_state=0)

Elastic Net

l1_range = np.arange(0.1, 1.05, 0.1).tolist()

regr_Enet = ElasticNetCV(cv=5, copy_X=True, n_alphas=100, l1_ratio=l1_range, selection='cyclic', verbose=2, n_jobs=1)

regr_multi_Enet = MultiOutputRegressor(regr_Enet)  # one ElasticNetCV per target

regr_multi_Enet.fit(X_train, y_train)

Random Forest

max_depth = 20
number_of_trees = 100

regr_multi_RF = MultiOutputRegressor(RandomForestRegressor(n_estimators=number_of_trees, max_depth=max_depth, random_state=0, n_jobs=1, verbose=1))

regr_multi_RF.fit(X_train, y_train)

y_multirf = regr_multi_RF.predict(X_test)

Everything runs fine, but I haven't found a way to obtain the coefficients (coef_) or the feature importances (feature_importances_) of the fitted models. When I write:

regr_multi_Enet.coef_
regr_multi_RF.feature_importances_

I get the following errors:

AttributeError: 'MultiOutputRegressor' object has no attribute 'feature_importances_'
AttributeError: 'MultiOutputRegressor' object has no attribute 'coef_'

I have read the documentation on MultiOutputRegressor, but I cannot find a way to extract the coefficients. How can I retrieve them?


Solution

  • MultiOutputRegressor itself doesn't have these attributes. It fits one copy of the base estimator per target column, so you need to go through the fitted estimators in its estimators_ attribute first (at the time of writing this attribute was not mentioned in the MultiOutputRegressor docs, but it does exist; see the docs for MultiOutputClassifier). Here is a reproducible example:

    import numpy as np
    from sklearn.multioutput import MultiOutputRegressor
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.linear_model import ElasticNet
    
    # dummy data
    X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
    W = np.array([[1, 1], [1, 1], [2, 2], [2, 2]])
    
    regr_multi_RF = MultiOutputRegressor(RandomForestRegressor())
    regr_multi_RF.fit(X, W)
    
    # how many estimators?
    len(regr_multi_RF.estimators_)
    # 2
    
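    # importances for the first target (exact values vary between runs,
    # since no random_state is set)
    regr_multi_RF.estimators_[0].feature_importances_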
    # array([ 0.4,  0.6])
    
    regr_multi_RF.estimators_[1].feature_importances_
    # array([ 0.4,  0.4])
    
    regr_Enet = ElasticNet()
    regr_multi_Enet = MultiOutputRegressor(regr_Enet)
    regr_multi_Enet.fit(X, W)
    
    regr_multi_Enet.estimators_[0].coef_
    # array([ 0.08333333,  0.        ])
    
    regr_multi_Enet.estimators_[1].coef_
    # array([ 0.08333333,  0.        ])
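
If you want all targets at once rather than indexing estimators_ one by one, the per-estimator attributes can be stacked into a single array with one row per target. The following is only a minimal sketch on the same dummy data; stack_attribute is a hypothetical helper name introduced here for illustration, not part of scikit-learn, and it assumes every fitted estimator exposes the requested attribute as a 1-D array of length n_features.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNet
from sklearn.multioutput import MultiOutputRegressor


def stack_attribute(multi_model, attr):
    """Stack a per-estimator attribute into an (n_targets, n_features) array.

    Hypothetical helper (not a scikit-learn function): it reads `attr`
    from each fitted estimator in `multi_model.estimators_`.
    """
    return np.vstack([getattr(est, attr) for est in multi_model.estimators_])


# same dummy data as in the example above
X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
W = np.array([[1, 1], [1, 1], [2, 2], [2, 2]])

regr_multi_RF = MultiOutputRegressor(RandomForestRegressor(random_state=0)).fit(X, W)
regr_multi_Enet = MultiOutputRegressor(ElasticNet()).fit(X, W)

# row i holds the values for target column i
importances = stack_attribute(regr_multi_RF, "feature_importances_")
coefs = stack_attribute(regr_multi_Enet, "coef_")

print(importances.shape)  # (2, 2)
print(coefs.shape)        # (2, 2)
```

Row i of each array corresponds to estimators_[i], so coefs[0] is the same as regr_multi_Enet.estimators_[0].coef_.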