python, scikit-learn, data-science, random-forest, eli5

eli5 permuter.feature_importances_ returning all zeros


I'm trying to get permutation importances for a RandomForestClassifier on a small sample of data, but while I can get simple feature importances, my permutation importances are coming back as all zeros.

This is the code:

Input1:

from sklearn.ensemble import RandomForestClassifier

# encoder is defined earlier (not shown)
X_train_encoded = encoder.fit_transform(X_train1)
X_val_encoded = encoder.transform(X_val1)
model = RandomForestClassifier(n_estimators=300, random_state=25,
                               n_jobs=-1, max_depth=2)
model.fit(X_train_encoded, y_train1)

Output1:

RandomForestClassifier(bootstrap=True, ccp_alpha=0.0, class_weight=None,
                       criterion='gini', max_depth=2, max_features='auto',
                       max_leaf_nodes=None, max_samples=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, n_estimators=300,
                       n_jobs=-1, oob_score=False, random_state=25, verbose=0,
                       warm_start=False)

Input2:

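from eli5.sklearn import PermutationImportance

# model is already fitted, so the default cv='prefit' scores it on the validation set
permuter = PermutationImportance(
    model,
    scoring='accuracy',
    n_iter=3,
    random_state=25
)
permuter.fit(X_val_encoded, y_val1)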

Output2:

PermutationImportance(cv='prefit',
                      estimator=RandomForestClassifier(bootstrap=True,
                                                       ccp_alpha=0.0,
                                                       class_weight=None,
                                                       criterion='gini',
                                                       max_depth=2,
                                                       max_features='auto',
                                                       max_leaf_nodes=None,
                                                       max_samples=None,
                                                       min_impurity_decrease=0.0,
                                                       min_impurity_split=None,
                                                       min_samples_leaf=1,
                                                       min_samples_split=2,
                                                       min_weight_fraction_leaf=0.0,
                                                       n_estimators=300,
                                                       n_jobs=-1,
                                                       oob_score=False,
                                                       random_state=25,
                                                       verbose=0,
                                                       warm_start=False),
                      n_iter=3, random_state=25, refit=True,
                      scoring='accuracy')

(PROBLEM) Input3:

import pandas as pd

feature_names = X_val_encoded.columns.tolist()
pd.Series(permuter.feature_importances_, feature_names).sort_values()

(PROBLEM) Output3:

Player     0.0
POS        0.0
ATT        0.0
YDS        0.0
TDS        0.0
REC        0.0
YDS.1      0.0
TDS.1      0.0
FL         0.0
FPTS       0.0
Overall    0.0
pos_adp    0.0
dtype: float64

I expected nonzero values here, but every importance is zero. Am I doing something wrong, or is this a possible result?

In:  permuter.feature_importances_
Out: array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

Solution

  • It turned out the issue was with the data I was passing in, not the code itself.

    The dataset had fewer than 70 observations. Once I added more (bringing it to just under 400), I got permutation importances as expected.
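
    To illustrate why all-zero importances are a possible result, here is a minimal pure-Python sketch of the general idea behind permutation importance: the baseline score minus the score after shuffling one column, averaged over a few shuffles. It is a simplified illustration, not eli5's actual implementation. The toy models, the data, and the helper names (accuracy, permutation_importances, constant_model, sensitive_model) are all hypothetical. One plausible reading of the original problem, which is an assumption and not something confirmed above, is that a very shallow forest scored on a tiny validation set never changed its predictions when a single column was shuffled, so accuracy never moved.

```python
import random

def accuracy(model, X, y):
    """Fraction of rows where the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importances(model, X, y, n_iter=3, seed=25):
    """Mean drop in accuracy after shuffling each column, over n_iter shuffles."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_iter):
            # Shuffle one column while leaving every other column untouched.
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_iter)
    return importances

# Hypothetical stand-in for a model whose predictions never change on this
# sample: the threshold is never crossed, so it always predicts class 0.
def constant_model(row):
    return 1 if row[0] > 100 else 0

# Hypothetical model that uses column 0 and ignores column 1.
def sensitive_model(row):
    return 1 if row[0] >= 5 else 0

# Made-up validation data: 5 rows, 2 features.
X_val = [[3, 7], [5, 1], [8, 4], [2, 9], [6, 6]]
y_val = [0, 1, 0, 0, 1]

flat = permutation_importances(constant_model, X_val, y_val)
sens = permutation_importances(sensitive_model, X_val, y_val)

print("constant model: ", flat)   # every importance is exactly 0.0
print("sensitive model:", sens)   # column 1 is exactly 0.0; column 0 depends on the shuffles
```

    With only 5 rows, accuracy can only move in steps of 1/5, so small effects are easy to miss. A quick check along the same lines for the real model would be to compare its predictions on the validation set before and after shuffling one column: if they are identical, an importance of zero for that column is the expected outcome, not a bug.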