machine-learning deep-learning nlp shap xai

How to get the SHAP value per class?


I want to get the SHAP values per class. I checked the tutorial and found the example below. However, the code does not work because shap_values.shape is (10, None, 6), where 10 is the number of samples and 6 is the number of classes.

import datasets
import pandas as pd
import transformers
import shap

dataset = datasets.load_dataset("emotion", split="train")
data = pd.DataFrame({"text": dataset["text"], "emotion": dataset["label"]})

# load the model and tokenizer
tokenizer = transformers.AutoTokenizer.from_pretrained(
    "nateraw/bert-base-uncased-emotion", use_fast=True
)
model = transformers.AutoModelForSequenceClassification.from_pretrained(
    "nateraw/bert-base-uncased-emotion"
).cuda()

# build a pipeline object to do predictions
pred = transformers.pipeline(
    "text-classification",
    model=model,
    tokenizer=tokenizer,
    device=0,
    return_all_scores=True,
)
explainer = shap.Explainer(pred)
shap_values = explainer(data["text"][:3])
shap.plots.bar(shap_values[:, :, "joy"].mean(0))

Is there any way to get a bar plot per class?


Solution

  • After installing shap 0.41.0, run: !pip3 install mxnet-mkl==1.6.0 numpy==1.23.1 . If you then hit an error mentioning dtype np.bool, change np.bool to np.bool_ in the source code that raises it. You may still see some warnings, but the plot will be produced.

    Edit: if you want to keep using the current version of shap, you only need to run: !pip3 install mxnet-mkl==1.6.0 numpy==1.23.1
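
Once the environment works, something like the following may help. It is only a sketch under two assumptions that are not in the answer above: that aliasing np.bool at runtime (before importing the failing package) is an acceptable substitute for editing installed source files, and that shap_values.output_names holds the class labels. plot_bar_per_class is a hypothetical helper, not part of shap; it simply repeats the tutorial's shap_values[:, :, "joy"].mean(0) call for every class.

```python
import numpy as np

# Assumption: the failing code still refers to np.bool, which NumPy 1.24 removed.
# Restoring the alias before importing that code avoids editing installed files.
if not hasattr(np, "bool"):
    np.bool = np.bool_


def plot_bar_per_class(shap_values, class_names, plot_fn):
    """Draw one bar plot per class (hypothetical helper, not part of shap).

    shap_values: Explanation-like object indexed as [sample, token, class name]
    class_names: iterable of class labels, e.g. shap_values.output_names
    plot_fn:     plotting callable, e.g. shap.plots.bar
    """
    for name in class_names:
        # Select one class, then average over the sample axis as in the tutorial.
        per_class = shap_values[:, :, name].mean(0)
        plot_fn(per_class)


# Usage with the objects from the question (not run here):
#   import shap
#   shap_values = explainer(data["text"][:3])
#   plot_bar_per_class(shap_values, shap_values.output_names, shap.plots.bar)
```

Passing the plotting function in as an argument keeps the helper importable even when shap itself is not installed, and lets you swap in another plot type.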