Tags: machine-learning, ensemble-learning, mlxtend

How to correctly combine my classifiers?


Now I want to build a meta-classifier that will take the probabilities as input and learn the weights of those 2 classifiers. It should automatically decide how much I should "trust" each of my classifiers.

This model is described here:
http://rasbt.github.io/mlxtend/user_guide/classifier/StackingClassifier/#stackingclassifier
I plan to use the mlxtend library, but it seems that StackingClassifier refits the models.
I do not want to refit because it takes a very long time.
On the other hand, I understand that refitting is necessary to "coordinate" the work of the classifiers and "tune" the whole system.

What should I do in such situation?


Solution

  • I won't talk about mlxtend because I haven't worked with it, but I'll tell you the general idea.

    You don't have to refit these models on the full training set, but you do have to refit them on parts of it so you can create out-of-fold predictions.

    Specifically, split your training data into a few pieces (usually 3 to 10). Keep one piece (i.e. fold) as validation data and train both models on the remaining folds. Then, predict the probabilities for the validation data using both models. Repeat the procedure, treating each fold as the validation set in turn. In the end, you will have out-of-fold probabilities for every data point in the training set.

    Then, you can train a meta-classifier on these probabilities and the ground-truth labels. Finally, apply the trained meta-classifier to your new data.
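The out-of-fold procedure above can be sketched with scikit-learn's `cross_val_predict`. Note this is an illustrative sketch, not mlxtend's actual implementation: the dataset and the two base models (a random forest and a logistic regression) are placeholder assumptions standing in for the asker's own classifiers.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

# Toy data standing in for the real training set.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Two base classifiers standing in for the asker's models.
clf1 = RandomForestClassifier(n_estimators=50, random_state=0)
clf2 = LogisticRegression(max_iter=1000)

# Out-of-fold probabilities: with cv=5, each row is predicted by a copy
# of the model that never saw that row during training.
p1 = cross_val_predict(clf1, X, y, cv=5, method="predict_proba")
p2 = cross_val_predict(clf2, X, y, cv=5, method="predict_proba")

# Stack the two models' probabilities side by side as meta-features.
meta_X = np.hstack([p1, p2])

# The meta-classifier learns how much to "trust" each base model.
meta_clf = LogisticRegression(max_iter=1000)
meta_clf.fit(meta_X, y)

# For new data: fit the base models once on the full training set,
# then feed their probabilities through the meta-classifier.
clf1.fit(X, y)
clf2.fit(X, y)

def predict(X_new):
    feats = np.hstack([clf1.predict_proba(X_new),
                       clf2.predict_proba(X_new)])
    return meta_clf.predict(feats)
```

Because the meta-features are produced out-of-fold, the meta-classifier is trained on probabilities that were not contaminated by each base model seeing its own training rows, which is the point of the fold-splitting step described above.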