python nlp lime

LIME gives the error "classifier models without probability scores" in Python


This is my first time using LIME, and I have never used any interpretation technique before.

Most likely I am doing something wrong, but I cannot figure out what it is.

I tried googling and went through Stack Overflow questions to find a way to resolve this, but did not find anything that helped.

My dataset df_reps looks like this:

Toyota Horse Toyota Gear... Mazda Night King
Green Mazda King Toyota ... Blue Mazda Toyota
...
...
Gear Tyre Toyota Geaer ... Horse Blue Park
Laptop Invoice Toyota ...  Horse Mango Kitkat

The label to predict is whether the customer approved or not, so the labels are only 0 and 1.

Here is my code

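import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split

def BOW(df):
  # Bag-of-words counts; pass ngram_range=(2,2) to use only bigrams
  CountVec = CountVectorizer()
  Count_data = CountVec.fit_transform(df).astype(np.uint8)
  # One column per vocabulary term, one row per document
  return pd.DataFrame(Count_data.toarray(), columns=CountVec.get_feature_names_out(), index=df.index)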

df = BOW(df_reps)
y = df_Labels    # this is either 0 or 1
X = df
X_train, X_test, y_train, y_test = train_test_split(X, y)

clf = RandomForestClassifier(max_depth=100)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

I converted the text into tabular format using a bag-of-words (BOW) representation, so I am using LimeTabularExplainer.

Here is the LIME part:

from lime.lime_tabular import LimeTabularExplainer

explainer = LimeTabularExplainer(X_train.values, feature_names=X_train.columns, verbose=True, mode='classification')
exp = explainer.explain_instance(X_test.values[1], clf.predict, num_features=10000)

But I am getting this error:

NotImplementedError: LIME does not currently support classifier models without probability scores. If this conflicts with your use case, please let us know: https://github.com/datascienceinc/lime/issues/16


Solution

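  • LimeTabularExplainer in classification mode requires class probabilities, not hard predictions. clf.predict returns only the labels (0 or 1), which is what triggers the NotImplementedError. So instead of passing clf.predict, pass either clf.predict_proba or a wrapper function that takes features and returns probabilities. Note also that explain_instance explains one row at a time, as in your X_test.values[1] call. For example, based on this tutorial (where rf is the fitted classifier and encoder is a preprocessing step, so with your code the function would simply be clf.predict_proba):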

    predict_fn = lambda x: rf.predict_proba(encoder.transform(x))
    exp = explainer.explain_instance(X_test, predict_fn)