python scikit-learn naivebayes

Extract log probabilities from MultinomialNB


I have a scikit-learn Pipeline made of a feature extractor and a VotingClassifier, which contains MultinomialNB and some other models. When I train MultinomialNB on its own, I can extract the log probabilities with nb.feature_log_prob_, but when I pull the model out of the fitted pipeline, this attribute is missing.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.ensemble import VotingClassifier
from sklearn.pipeline import Pipeline

vclf = Pipeline([
    ('vect', CountVectorizer()),
    ('clf', VotingClassifier(
        estimators=[
            ('nb', MultinomialNB()),
            [...]
        ]
    ))
])
vclf.fit(train_X, train_y)

nb = vclf.named_steps['clf'].estimators[0][1]
nb.feature_log_prob_ 

AttributeError: 'MultinomialNB' object has no attribute 'feature_log_prob_'


Solution

  • According to the documentation, estimators_ (with a trailing underscore) is the attribute that holds the fitted sub-estimators of the VotingClassifier, as a plain list in the same order as the estimators you passed in. Your code should therefore look like this:

    nb = vclf.named_steps['clf'].estimators_[0]
    print(nb.feature_log_prob_)
    

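    The estimators parameter only stores the (name, estimator) pairs you passed to the constructor. VotingClassifier clones each of them during fit and trains the clones, so the MultinomialNB you reached through estimators was never fitted and therefore had no feature_log_prob_ attribute. That is where the error came from.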
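To make the difference concrete, here is a minimal, self-contained sketch. The toy training data and the LogisticRegression stand-in for the elided models are assumptions made only for illustration; they are not part of the original question. It also shows named_estimators_, which lets you look up the fitted model by name rather than by position.

```python
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Toy data, invented purely for illustration.
train_X = ["good movie", "great film", "bad movie", "awful film"]
train_y = [1, 1, 0, 0]

vclf = Pipeline([
    ("vect", CountVectorizer()),
    ("clf", VotingClassifier(
        estimators=[
            ("nb", MultinomialNB()),
            ("lr", LogisticRegression()),  # hypothetical stand-in for the other models
        ]
    )),
])
vclf.fit(train_X, train_y)

voting = vclf.named_steps["clf"]

# `estimators` holds the original (name, estimator) pairs, which stay unfitted.
unfitted_nb = voting.estimators[0][1]

# `estimators_` holds the fitted clones, in the same order.
fitted_nb = voting.estimators_[0]

# `named_estimators_` gives access to the same fitted clones by name.
fitted_nb_by_name = voting.named_estimators_["nb"]

print(hasattr(unfitted_nb, "feature_log_prob_"))  # False
print(fitted_nb.feature_log_prob_.shape)          # (n_classes, n_features)
```

Looking the model up by name avoids depending on the position of the MultinomialNB entry in the estimators list.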