[SOLVED] SHAP - instances that have more than one dimension

I am very new to SHAP and I would like to give it a try, but I am having some difficulty.

The model is already trained and seems to perform well. I am using the training data to try SHAP out. It looks like this:

   var_Braeburn  var_Cripps Pink  var_Dazzle  var_Fuji  var_Granny Smith  \
0             1                0           0         0                 0   
1             0                1           0         0                 0   
2             0                1           0         0                 0   
3             0                1           0         0                 0   
4             0                1           0         0                 0   

   var_Other Variety  var_Royal Gala (Tenroy)  root_CG202  root_M793  \
0                  0                        0           0          0   
1                  0                        0           1          0   
2                  0                        0           1          0   
3                  0                        0           0          0   
4                  0                        0           0          0   

   root_MM106  ...  frt_BioRich Organic Compost_single  \
0           1  ...                                   0   
1           0  ...                                   0   
2           0  ...                                   0   
3           1  ...                                   0   
4           1  ...                                   0   

   frt_Biomin Boron_single  frt_Biomin Zinc_single  \
0                        0                       1   
1                        0                       0   
2                        0                       0   
3                        0                       0   
4                        0                       0   

   frt_Fertco Brimstone90 sulphur_single  frt_Fertco Guano _single  \
0                                      0                         0   
1                                      0                         0   
2                                      0                         0   
3                                      0                         0   
4                                      0                         0   

   frt_Gro Mn_multiple  frt_Gro Mn_single  frt_Organic Mag Super_multiple  \
0                    0                  0                               0   
1                    1                  0                               1   
2                    1                  0                               1   
3                    1                  0                               1   
4                    1                  0                               1   

   frt_Organic Mag Super_single  frt_Other Fertiliser  
0                             0                     0  
1                             0                     0  
2                             0                     0  
3                             0                     0  
4                             0                     0

I then do explainer = shap.Explainer(model) and shap_values = explainer(X_train)

This runs without error and shap_values gives me this:

.values =
array([[[ 0.00775555, -0.00775555],
        [-0.03221035,  0.03221035],
        [-0.0027203 ,  0.0027203 ],
        ...,
        [ 0.00259787, -0.00259787],
        [-0.00459262,  0.00459262],
        [-0.0303394 ,  0.0303394 ]],

       [[-0.00068313,  0.00068313],
        [-0.03006355,  0.03006355],
        [-0.00245706,  0.00245706],
        ...,
        [-0.00418809,  0.00418809],
        [-0.00088372,  0.00088372],
        [-0.00030019,  0.00030019]],

       [[-0.00068313,  0.00068313],
        [-0.03006355,  0.03006355],
        [-0.00245706,  0.00245706],
        ...,
        [-0.00418809,  0.00418809],
        [-0.00088372,  0.00088372],
        [-0.00030019,  0.00030019]],

       ...,

However, when I then run shap.plots.beeswarm(shap_values), I get the following error:

ValueError: The beeswarm plot does not support plotting explanations with instances that have more than one dimension!

What am I doing wrong here?

Solution

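Your explanation holds one set of SHAP values per class, so each instance is 2-dimensional (features × classes), while beeswarm expects one value per feature. Here is a reproducible example on a public dataset: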

from shap import Explainer
from shap.plots import beeswarm
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_breast_cancer

# Binary classification data, loaded as a DataFrame so feature names are kept
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
model = RandomForestClassifier().fit(X, y)

# The explanation is indexed as (sample, feature, class)
explainer = Explainer(model)
sv = explainer(X)

Then, retrieve SHAP values just for class 1:

beeswarm(sv[:,:,1])
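
To see what that indexing does, here is a minimal NumPy sketch. It makes no SHAP call: the plain array `values` is only a stand-in for `shap_values.values` (an assumption for illustration), filled with the first few numbers printed in the question.

```python
import numpy as np

# Stand-in for shap_values.values, laid out as (n_samples, n_features, n_classes).
# The numbers are copied from the first two samples shown in the question.
values = np.array([
    [[ 0.00775555, -0.00775555],
     [-0.03221035,  0.03221035],
     [-0.0027203 ,  0.0027203 ]],
    [[-0.00068313,  0.00068313],
     [-0.03006355,  0.03006355],
     [-0.00245706,  0.00245706]],
])
print(values.shape)  # (2, 3, 2): three dimensions, which beeswarm rejects

# Keep every sample and every feature, but only one class.
class_index = 1
per_class = values[:, :, class_index]
print(per_class.shape)  # (2, 3): one value per sample and feature

# In the question's output the two columns are exact negatives of each other.
mirrored = np.allclose(values[:, :, 0], -values[:, :, 1])
print(mirrored)  # True
```

Because the two columns mirror each other here, plotting class 0 instead of class 1 would presumably show the same picture with the signs flipped.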

So, whenever a model gives you this error message, try reducing the dimensionality by selecting a single class:

sv[:,:,class_index]
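
If you would rather not remember when the slice is needed, a small guard can do the check for you. The helper below is a hypothetical sketch, not part of SHAP: the name `select_class` is made up, and it assumes the object exposes `.shape` and supports `[:, :, i]` indexing, as NumPy arrays do and as the slice above suggests for the explanation object.

```python
import numpy as np

def select_class(explanation, class_index=1):
    """Return a (samples x features) view, slicing out one class if needed.

    Hypothetical helper: assumes `explanation` has a `.shape` attribute and
    supports NumPy-style indexing.
    """
    ndim = len(explanation.shape)
    if ndim == 3:
        # (samples, features, classes) -> (samples, features)
        return explanation[:, :, class_index]
    if ndim == 2:
        # Already one value per sample and feature; nothing to do.
        return explanation
    raise ValueError(f"expected 2 or 3 dimensions, got {ndim}")

# Demonstration on plain arrays standing in for SHAP output
multi_output = np.zeros((4, 3, 2))
single_output = np.ones((4, 3))
print(select_class(multi_output, class_index=1).shape)  # (4, 3)
print(select_class(single_output).shape)                # (4, 3)
```

With the example from this answer, the call would then look like beeswarm(select_class(sv, 1)), under the assumptions stated above.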