
Predict continuous values using sklearn's BaggingClassifier


Can I use sklearn's BaggingClassifier to produce continuous predictions? Is there a similar package? My understanding is that the bagging classifier fits several different models, has each one predict a class, and then reports the majority answer. It seems like this algorithm could instead generate a probability for each class from every model and then report the mean value.

from sklearn.ensemble import BaggingClassifier, ExtraTreesClassifier

trees = BaggingClassifier(ExtraTreesClassifier())
trees.fit(X_train, Y_train)
Y_pred = trees.predict(X_test)

Solution

  • If you're interested in predicting probabilities for the classes in your classifier, you can use the predict_proba method, which gives you a probability for each class. It's a one-line change to your code:

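    from sklearn.ensemble import BaggingClassifier, ExtraTreesClassifier

    trees = BaggingClassifier(ExtraTreesClassifier())
    trees.fit(X_train, Y_train)
    Y_pred = trees.predict_proba(X_test)  # class probabilities instead of labels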
    

    Y_pred will have shape (n_samples, n_classes). Each row holds the class probabilities for one sample, with the columns ordered as in trees.classes_.

    If your Y_train values are continuous and you want to predict those continuous values (i.e., you're working on a regression problem), then you can use the BaggingRegressor instead.
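
Both options can be tried end to end with the rough sketch below. The data is synthetic (generated with make_classification and make_regression), and the variable names, sample sizes, and n_estimators settings are illustrative assumptions rather than anything from the question. Note that in scikit-learn, when the base estimator implements predict_proba, BaggingClassifier.predict already returns the class with the highest mean predicted probability, so predict_proba simply exposes the averaged values behind that decision.

```python
import numpy as np
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import (
    BaggingClassifier,
    BaggingRegressor,
    ExtraTreesClassifier,
    ExtraTreesRegressor,
)
from sklearn.model_selection import train_test_split

# --- Classification: averaged class probabilities ---
# Synthetic 3-class problem (made-up data, for illustration only).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = BaggingClassifier(
    ExtraTreesClassifier(n_estimators=10), n_estimators=5, random_state=0
)
clf.fit(X_train, y_train)

proba = clf.predict_proba(X_test)  # shape (n_samples, n_classes)
labels = clf.predict(X_test)       # class with the highest mean probability

# --- Regression: continuous targets ---
# Synthetic regression problem (again, made-up data).
Xr, yr = make_regression(n_samples=300, n_features=10, noise=0.1, random_state=0)
Xr_train, Xr_test, yr_train, yr_test = train_test_split(Xr, yr, random_state=0)

reg = BaggingRegressor(
    ExtraTreesRegressor(n_estimators=10), n_estimators=5, random_state=0
)
reg.fit(Xr_train, yr_train)

y_cont = reg.predict(Xr_test)      # shape (n_samples,), continuous values

print(proba.shape, y_cont.shape)
```

Use the classifier branch when the targets are class labels and you want a confidence per class; use the regressor branch when the targets themselves are continuous.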