I'm working on a classification problem and have multiple fitted sklearn classifiers, like
svm = SVC().fit(X_train, y_train)
dt = tree.DecisionTreeClassifier(criterion='entropy', max_depth=4000).fit(X_train, y_train)
...
for i in range(num_of_models):
    m2 = create_model_for_ensemble(dummy_y_train.shape[1])
    m2.fit(X_train_array[i], dummy_y_train, epochs=150, batch_size=100, verbose=0)
    models.append(m2)
# m2 is a customized Neural Network Classifier, that has a custom predict function (m2.predict_classes)
# The above code is just an example, the point is - m2 is also a classifier.
... etc.
These all receive the same inputs and produce the same type of output: each one can predict a label for a row of my data:
     label  attribute_1  attribute_2  ...  attribute_79
1        ?     0.199574     0.203156  ...      0.046898
2        ?     0.201461     0.203837  ...      0.075002
3        ?     0.209044     0.214268  ...      0.143278
...    ...          ...          ...  ...           ...
Where label is a whole number ranging from 0 to 29.
My goal is to build an AdaBoost classifier that includes all of the above (svm, dt, m2), but I haven't been able to find an example on Google; every example just talks about multiple different decision trees, or multiple different (but the same type of) classifiers.
I know it can be done: for each row (or data point) of my dataframe, the weight of each classifier has to be adjusted, and that doesn't require all of them to be the same type of classifier - they just all need a .predict method.
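To make the idea concrete, here is a minimal sketch of the SAMME-style weight update I have in mind, written for already-fitted heterogeneous classifiers (it only assumes each one has a .predict method; it is just the update rule, not a full AdaBoost, since the base learners are not re-fitted on the reweighted sample):

```python
import numpy as np

def samme_weights(classifiers, X, y, n_classes=30):
    """SAMME-style weighting over already-fitted, heterogeneous classifiers.

    Returns one alpha per classifier plus the final per-row sample weights.
    """
    n = len(X)
    sample_weight = np.full(n, 1.0 / n)
    alphas = []
    for clf in classifiers:
        pred = clf.predict(X)
        miss = (pred != y)
        # weighted error of this classifier on the current sample weights
        err = np.clip(np.average(miss, weights=sample_weight), 1e-10, 1 - 1e-10)
        # SAMME estimator weight for a K-class problem
        alpha = np.log((1 - err) / err) + np.log(n_classes - 1)
        alphas.append(alpha)
        # up-weight the rows this classifier got wrong, then renormalize
        sample_weight = sample_weight * np.exp(alpha * miss)
        sample_weight /= sample_weight.sum()
    return alphas, sample_weight
```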
So how do I go about doing this? Can anyone give me an example?
To include all the classifiers [svm, dt, m2], first create an ensemble model, then feed that ensemble as the base estimator for AdaBoost. Try something like this:
from sklearn import datasets
from sklearn.ensemble import AdaBoostClassifier, VotingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

# load the data before splitting it
iris = datasets.load_iris()
X, y = iris.data[:, 1:3], iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# soft voting needs predict_proba, hence probability=True for the SVC
votingClf = VotingClassifier([('clf1', SVC(probability=True)), ('clf2', DecisionTreeClassifier())], voting='soft')
adaBoostClassifier = AdaBoostClassifier(base_estimator=votingClf)  # in sklearn >= 1.2 the argument is named estimator
adaBoostClassifier.fit(X_train, y_train)
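The above only covers the two sklearn models. To fold your network m2 into the VotingClassifier as well, it needs the sklearn estimator interface, and its fit must accept sample_weight because AdaBoost reweights the training rows. Here is a minimal wrapper sketch; build_fn is a hypothetical factory you would supply that builds and compiles your network:

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin

class NetWrapper(BaseEstimator, ClassifierMixin):
    """Sketch of an sklearn-compatible wrapper around a Keras-style model.

    build_fn is a hypothetical, user-supplied factory returning a model
    whose fit accepts sample_weight and whose predict returns class
    probabilities of shape (n_samples, n_classes).
    """
    def __init__(self, build_fn=None, epochs=150, batch_size=100):
        self.build_fn = build_fn
        self.epochs = epochs
        self.batch_size = batch_size

    def fit(self, X, y, sample_weight=None):
        self.classes_ = np.unique(y)
        self.model_ = self.build_fn()
        self.model_.fit(X, y, epochs=self.epochs, batch_size=self.batch_size,
                        sample_weight=sample_weight, verbose=0)
        return self

    def predict_proba(self, X):
        return self.model_.predict(X)

    def predict(self, X):
        # map the argmax over probabilities back to the original labels
        return self.classes_[np.argmax(self.predict_proba(X), axis=1)]
```

You could then add ('clf3', NetWrapper(build_fn=create_model_for_ensemble_factory)) to the VotingClassifier's estimator list, and call adaBoostClassifier.predict(X_test) as usual.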