I have a scikit-learn Pipeline made of a feature extractor and a VotingClassifier, which contains MultinomialNB and some other models. When I train MultinomialNB separately, I can extract the log probabilities using nb.feature_log_prob_, but inside a pipeline this attribute is missing.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.ensemble import VotingClassifier
from sklearn.pipeline import Pipeline

vclf = Pipeline([
    ('vect', CountVectorizer()),
    ('clf', VotingClassifier(
        estimators=[
            ('nb', MultinomialNB()),
            [...]
        ]
    ))
])
vclf.fit(train_X, train_y)

nb = vclf.named_steps['clf'].estimators[0][1]
nb.feature_log_prob_
AttributeError: 'MultinomialNB' object has no attribute 'feature_log_prob_'
According to the documentation, estimators_ is the correct attribute to access the list of fitted sub-estimators of the VotingClassifier. Your code should therefore look like this:
nb = vclf.named_steps['clf'].estimators_[0]
print(nb.feature_log_prob_)
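The MultinomialNB you accessed with estimators was not fitted and therefore did not provide the feature_log_prob_ attribute. VotingClassifier fits clones of the estimators you pass in and stores them in estimators_, leaving the original objects in estimators untouched. That is where the error came from.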
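
To see the difference end to end, here is a minimal sketch you can run. The toy training data and the LogisticRegression second estimator are my own placeholders (the question elides the other models), so swap in your real ones. It also uses named_estimators_, which should let you look up the fitted model by name instead of by position.

```python
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Toy data, made up for illustration only.
train_X = ["good movie", "great film", "bad movie", "awful film"]
train_y = [1, 1, 0, 0]

vclf = Pipeline([
    ("vect", CountVectorizer()),
    ("clf", VotingClassifier(estimators=[
        ("nb", MultinomialNB()),
        # Placeholder for the "other models" elided in the question.
        ("lr", LogisticRegression()),
    ])),
])
vclf.fit(train_X, train_y)

voting = vclf.named_steps["clf"]

# estimators holds the (name, estimator) pairs exactly as passed in: never fitted.
unfitted_nb = voting.estimators[0][1]
has_unfitted = hasattr(unfitted_nb, "feature_log_prob_")

# estimators_ holds the fitted clones, in the same order.
fitted_nb = voting.estimators_[0]
has_fitted = hasattr(fitted_nb, "feature_log_prob_")

# Lookup by name returns the same fitted object.
same_object = voting.named_estimators_["nb"] is fitted_nb

# One row per class, one column per vocabulary term.
shape = fitted_nb.feature_log_prob_.shape

print(has_unfitted, has_fitted, same_object, shape)
```

Looking the model up by name is a bit more robust than indexing with [0], since it keeps working if you reorder the estimators list.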