I'm doing text classification with Naive Bayes in Python and want to find out which words the classifier relies on to decide which class a text belongs to.
I found this answer, https://stackoverflow.com/a/62097661/3992979, but it doesn't help me: my vectorizer has no get_feature_names() method and my Naive Bayes classifier has no coef_ attribute.
df_train is a data frame with manually labelled training data; df_test is a data frame with unlabelled data that NB should classify. There are only two classes: "terror" is 1 for text about terrorist attacks and 0 for text without that topic.
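from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

### Create "Bag of Words"
vec = CountVectorizer(
    ngram_range=(1, 3)  # unigrams, bigrams, and trigrams
)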
x_train = vec.fit_transform(df_train.clean_text)
x_test = vec.transform(df_test.clean_text)
y_train = df_train.terror
y_test = df_test.terror
### Train and evaluate the model (Naive Bayes classification)
nb = MultinomialNB()
nb.fit(x_train, y_train)
preds = nb.predict(x_test)
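I figured it out by trial and error: the vectorizer provides get_feature_names_out(), and the classifier stores the per-class word log-probabilities in feature_log_prob_: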
### Get the words that trigger the AI detection
features_log_prob = nb.feature_log_prob_
feature_names = vec.get_feature_names_out()
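import numpy as np

def show_top100(classifier, vectorizer, categories):
    """Print the 100 features with the highest log-probability per class."""
    feature_names = vectorizer.get_feature_names_out()
    for i, category in enumerate(categories):
        # argsort is ascending, so the last 100 indices are the top 100
        top100 = np.argsort(classifier.feature_log_prob_[i])[-100:]
        print("%s: %s" % (category, " ".join(feature_names[top100])))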
show_top100(nb, vec, nb.classes_)
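One caveat with the ranking above: the highest values in feature_log_prob_ for a class are the words that are most frequent in that class, so common words can top the list for both classes. A possible refinement, sketched here as an assumption rather than as part of the original solution, is to rank words by the difference between the two classes' log-probabilities. The helper name top_discriminative and the toy matrix are hypothetical; the toy values only stand in for nb.feature_log_prob_.

```python
import numpy as np

def top_discriminative(feature_log_prob, feature_names, k=10):
    """Rank features by log P(word | class 1) - log P(word | class 0).

    feature_log_prob: shape (2, n_features), e.g. nb.feature_log_prob_
    feature_names: n_features strings, e.g. vec.get_feature_names_out()
    Returns (words_for_first_class, words_for_second_class).
    """
    log_prob = np.asarray(feature_log_prob)
    names = np.asarray(feature_names)
    # Positive values favour the second row (class 1), negative the first.
    log_ratio = log_prob[1] - log_prob[0]
    order = np.argsort(log_ratio)                  # ascending
    top_class_0 = names[order[:k]].tolist()        # most negative first
    top_class_1 = names[order[::-1][:k]].tolist()  # most positive first
    return top_class_0, top_class_1

# Hypothetical toy data standing in for a fitted model (rows: class 0, class 1).
toy_names = np.array(["the", "attack", "bomb", "weather", "football"])
toy_log_prob = np.log(np.array([
    [0.50, 0.05, 0.05, 0.25, 0.15],  # class 0
    [0.50, 0.25, 0.15, 0.06, 0.04],  # class 1
]))

class_0_words, class_1_words = top_discriminative(toy_log_prob, toy_names, k=2)
print("0:", " ".join(class_0_words))
print("1:", " ".join(class_1_words))
```

In the toy data, "the" has the highest log-probability in both classes but a log-ratio of zero, so it drops out of both lists. With the fitted model from above, the call would be top_discriminative(nb.feature_log_prob_, vec.get_feature_names_out(), k=100), assuming nb.classes_ is ordered [0, 1].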