machine-learningscikit-learnclassificationrandom-forestadaboost

using random forest as base classifier with adaboost


Can I use AdaBoost with random forest as a base classifier? I searched on the internet and I didn't find anyone who does it.

Like in the following code; I try to run it but it takes a lot of time:

estimators = Pipeline([('vectorizer', CountVectorizer()),
                       ('transformer', TfidfTransformer()),
                       ('classifier', AdaBoostClassifier(learning_rate=1))])

RF=RandomForestClassifier(criterion='entropy',n_estimators=100,max_depth=500,min_samples_split=100,max_leaf_nodes=None,
                          max_features='log2')


param_grid={
    'vectorizer__ngram_range': [(1,2),(1,3)],
    'vectorizer__min_df': [5],
    'vectorizer__max_df': [0.7],
    'vectorizer__max_features': [1500],

    'transformer__use_idf': [True , False],
    'transformer__norm': ('l1','l2'),
    'transformer__smooth_idf': [True , False],
     'transformer__sublinear_tf': [True , False],

    'classifier__base_estimator':[RF],
    'classifier__algorithm': ("SAMME.R","SAMME"),
    'classifier__n_estimators':[4,7,11,13,16,19,22,25,28,31,34,43,50]
}

I tried with the GridSearchCV, I added the RF classifier into the AdaBoost parameters. if I use it would the accuracy increase?


Solution

  • No wonder you have not actually seen anyone doing it - it is an absurd and bad idea.

    You are trying to build an ensemble (Adaboost) which in itself consists of ensemble base classifiers (RFs) - essentially an "ensemble-squared"; so, no wonder about the high computation time.

    But even if it was practical, there are good theoretical reasons not to do it; quoting from my own answer in Execution time of AdaBoost with SVM base classifier:

    Adaboost (and similar ensemble methods) were conceived using decision trees as base classifiers (more specifically, decision stumps, i.e. DTs with a depth of only 1); there is good reason why still today, if you don't specify explicitly the base_classifier argument, it assumes a value of DecisionTreeClassifier(max_depth=1). DTs are suitable for such ensembling because they are essentially unstable classifiers, which is not the case with SVMs, hence the latter are not expected to offer much when used as base classifiers.

    On top of this, SVMs are computationally much more expensive than decision trees (let alone decision stumps), which is the reason for the long processing times you have observed.

    The argument holds for RFs, too - they are not unstable classifiers, hence there is not any reason to actually expect performance improvements when using them as base classifiers for boosting algorithms, like Adaboost.