I am very new to SHAP and I would like to give it a try, but I am having some difficulty.
The model is already trained and seems to perform well. I am using the training data to test SHAP; it looks like this:
   var_Braeburn  var_Cripps Pink  var_Dazzle  var_Fuji  var_Granny Smith  \
0             1                0           0         0                 0
1             0                1           0         0                 0
2             0                1           0         0                 0
3             0                1           0         0                 0
4             0                1           0         0                 0

   var_Other Variety  var_Royal Gala (Tenroy)  root_CG202  root_M793  \
0                  0                        0           0          0
1                  0                        0           1          0
2                  0                        0           1          0
3                  0                        0           0          0
4                  0                        0           0          0

   root_MM106  ...  frt_BioRich Organic Compost_single  \
0           1  ...                                   0
1           0  ...                                   0
2           0  ...                                   0
3           1  ...                                   0
4           1  ...                                   0

   frt_Biomin Boron_single  frt_Biomin Zinc_single  \
0                        0                       1
1                        0                       0
2                        0                       0
3                        0                       0
4                        0                       0

   frt_Fertco Brimstone90 sulphur_single  frt_Fertco Guano _single  \
0                                      0                         0
1                                      0                         0
2                                      0                         0
3                                      0                         0
4                                      0                         0

   frt_Gro Mn_multiple  frt_Gro Mn_single  frt_Organic Mag Super_multiple  \
0                    0                  0                               0
1                    1                  0                               1
2                    1                  0                               1
3                    1                  0                               1
4                    1                  0                               1

   frt_Organic Mag Super_single  frt_Other Fertiliser
0                             0                     0
1                             0                     0
2                             0                     0
3                             0                     0
4                             0                     0
I then run:
explainer = shap.Explainer(model)
shap_values = explainer(X_train)
This runs without error, and shap_values gives me this:
.values =
array([[[ 0.00775555, -0.00775555],
[-0.03221035, 0.03221035],
[-0.0027203 , 0.0027203 ],
...,
[ 0.00259787, -0.00259787],
[-0.00459262, 0.00459262],
[-0.0303394 , 0.0303394 ]],
[[-0.00068313, 0.00068313],
[-0.03006355, 0.03006355],
[-0.00245706, 0.00245706],
...,
[-0.00418809, 0.00418809],
[-0.00088372, 0.00088372],
[-0.00030019, 0.00030019]],
[[-0.00068313, 0.00068313],
[-0.03006355, 0.03006355],
[-0.00245706, 0.00245706],
...,
[-0.00418809, 0.00418809],
[-0.00088372, 0.00088372],
[-0.00030019, 0.00030019]],
...,
However, when I then run shap.plots.beeswarm(shap_values), I get the following error:
ValueError: The beeswarm plot does not support plotting explanations with instances that have more than one dimension!
What am I doing wrong here?
Try this:
from shap import Explainer
from shap.plots import beeswarm
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_breast_cancer
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = RandomForestClassifier().fit(X, y)
explainer = Explainer(model)
sv = explainer(X)  # sv.values is 3-D here: (samples, features, classes)
Then, because RF is a bit special, retrieve the SHAP values for class 1 only:
beeswarm(sv[:, :, 1])
EDIT: One year later I landed on my own post again after receiving:
ValueError: The beeswarm plot does not support plotting explanations with instances that have more than one dimension!
So, instead of stating "RF is special", I should say: whenever a model gives you this error message, try reducing the dimensionality by selecting a single class, e.g. sv[:,:,class_index].
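To make the shape issue concrete, here is a minimal sketch that uses plain NumPy as a stand-in for shap_values.values, filled with the first two rows and first two features printed in the question. The helper name select_class is hypothetical (it is not part of shap); it only slices when a trailing class axis is present. I am assuming the same slicing works on the shap explanation object itself, as in sv[:,:,1] above.

```python
import numpy as np

def select_class(sv, class_index=1):
    """Hypothetical helper: return a 2-D (samples x features) view of sv.

    If sv has a trailing class axis (samples x features x classes),
    keep only class_index; otherwise return sv unchanged.
    """
    if len(sv.shape) == 3:
        return sv[:, :, class_index]
    return sv

# Stand-in for shap_values.values, copied from the question's output
# (first two samples, first two features, two classes).
values = np.array([
    [[0.00775555, -0.00775555], [-0.03221035, 0.03221035]],
    [[-0.00068313, 0.00068313], [-0.03006355, 0.03006355]],
])

print(values.shape)   # three axes: samples x features x classes

class1 = select_class(values, class_index=1)
print(class1.shape)   # two axes: samples x features

# In this output the two class columns are exact negatives of each other,
# so the class 0 slice is just the mirror image of the class 1 slice.
print(np.allclose(values[:, :, 0], -values[:, :, 1]))
```

With the real objects, the equivalent call would be something like beeswarm(select_class(sv, 1)), or in the question's terms shap.plots.beeswarm(shap_values[:, :, 1]).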