python, machine-learning, svm, feature-selection, mlxtend

mlxtend.feature_selection forward selection not working with SVM linear kernel?


I'm performing feature selection using an SVM with the mlxtend package. X is a DataFrame with the features, and y is the target variable. This is part of my code:

from sklearn.svm import SVC
from mlxtend.feature_selection import SequentialFeatureSelector as SFS

def SFFS(X, y, C_GS, gamma_GS, kernel_GS):
    # Forward selection with an SVC that takes gamma (e.g. the rbf kernel).
    # X.shape[1] stands in for num_of_features, which was defined outside
    # this snippet (assumed to be the number of columns in X).
    sfs = SFS(SVC(kernel=kernel_GS, C=C_GS, gamma=gamma_GS),
              k_features=(1, X.shape[1]),
              forward=True,
              floating=False,
              verbose=2,
              scoring='roc_auc',
              # scoring='accuracy',
              cv=10,
              n_jobs=-1
              ).fit(X, y)
    return sfs

def SFFS_lin(X, y, C_GS, kernel_GS):
    # Same selector, but without gamma (used here with the linear kernel).
    sfs = SFS(SVC(kernel=kernel_GS, C=C_GS),
              k_features=(1, X.shape[1]),
              forward=True,
              floating=False,
              verbose=2,
              scoring='roc_auc',
              cv=10,
              n_jobs=-1
              ).fit(X, y)
    return sfs

def featureNames(sfs):
    Feature_Names = sfs.k_feature_names_
    return Feature_Names


sfs_lin = SFFS_lin(X, y, 1,'linear')
#sfs_rbf = SFFS(X, y, 1, 'auto', 'rbf')
names = featureNames(sfs_lin)
print(names)

The code starts running, but shortly afterwards it freezes here:

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done 28 out of 28 | elapsed: 2.5s remaining: 0.0s
[Parallel(n_jobs=-1)]: Done 28 out of 28 | elapsed: 2.5s finished

[2021-01-24 00:01:57] Features: 1/28 -- score: 0.6146428161908037[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
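As a reading aid for the log above, here is a minimal, library-free sketch of greedy forward selection. It is not mlxtend's implementation; `forward_select` and `toy_score` are hypothetical names, and the toy scoring function stands in for the cross-validated `roc_auc`. The point is only that each round scores one candidate subset per remaining feature, which appears to match "Done 28 out of 28" in the first round, with the stall happening during the second round.

```python
from typing import Callable, List, Sequence, Tuple


def forward_select(
    features: Sequence[int],
    score: Callable[[List[int]], float],
    k_max: int,
) -> List[Tuple[int, Tuple[int, ...], float]]:
    """Greedy forward selection (illustrative sketch only).

    Returns one entry per round:
    (number of candidate subsets scored, selected features so far, best score).
    """
    selected: List[int] = []
    remaining = list(features)
    history = []
    while remaining and len(selected) < k_max:
        # One candidate subset per remaining feature.
        candidates = [selected + [f] for f in remaining]
        scores = [score(c) for c in candidates]
        # Keep the feature whose addition gives the highest score.
        best_idx = max(range(len(candidates)), key=scores.__getitem__)
        selected.append(remaining.pop(best_idx))
        history.append((len(candidates), tuple(selected), scores[best_idx]))
    return history


def toy_score(subset: List[int]) -> float:
    # Hypothetical stand-in for a cross-validated score:
    # lower-numbered features are simply worth more.
    return sum(1.0 / (1 + f) for f in subset)


if __name__ == "__main__":
    for n_candidates, chosen, best in forward_select(range(28), toy_score, k_max=3):
        print(f"scored {n_candidates} candidates, selected {chosen}, score {best:.3f}")
```

With 28 features, the rounds score 28, 27 and 26 candidates respectively, so each later round fits slightly fewer but larger models.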

With the rbf kernel, the code runs fine. It also runs fine if I switch to backward elimination by setting the forward parameter to False:

forward=False,

The freeze only seems to appear when doing forward selection with the linear kernel. Is this a bug, or am I missing something trivial?

System info:

Python 3.8.5
scikit-learn 0.24.1
mlxtend 0.18.0

Solution

  • It seems that this is just a bug.

    I switched the cross-validation parameter from

    cv = 10

    to 9, and it runs.
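The workaround does not say why cv = 10 stalls. One way to probe it, under the assumption (not confirmed here) that the stall is the linear-kernel solver converging slowly on some fold rather than a real deadlock, is to time a single cross-validation outside mlxtend with `max_iter` capped. The sketch below uses scikit-learn only; `timed_cv` is a hypothetical helper, and the synthetic data is a placeholder for the real X and y.

```python
import time
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic placeholder data (assumption); substitute the real X and y.
X_demo, y_demo = make_classification(n_samples=200, n_features=28, random_state=0)


def timed_cv(X, y, cv, max_iter=10_000):
    """Run one roc_auc cross-validation of a linear SVC and time it.

    max_iter caps the solver so a slow fit stops instead of appearing to hang.
    """
    clf = SVC(kernel="linear", C=1, max_iter=max_iter)
    start = time.perf_counter()
    with warnings.catch_warnings():
        # Hitting the cap raises ConvergenceWarning; silence it for this probe.
        warnings.simplefilter("ignore", ConvergenceWarning)
        scores = cross_val_score(clf, X, y, scoring="roc_auc", cv=cv)
    return scores, time.perf_counter() - start


if __name__ == "__main__":
    # Two columns mimic the second round of forward selection, where the stall appeared.
    for cv in (9, 10):
        scores, seconds = timed_cv(X_demo[:, :2], y_demo, cv)
        print(f"cv={cv}: mean roc_auc={scores.mean():.3f}, {seconds:.2f}s")
```

If the capped run finishes quickly on the real data for both cv values while the uncapped one does not, slow convergence is the more likely explanation, and scaling the features would be worth trying. If both behave the same, the problem is probably elsewhere.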