I am trying to compute SHAP values for a Gaussian Process Regression (GPR) model using the SHAP library, but all the SHAP values come out as zero. I am using the example from the official documentation; the only change I made was swapping the model for GPR.
import sklearn
from sklearn.model_selection import train_test_split
import numpy as np
import shap
import time
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern, WhiteKernel, ConstantKernel
shap.initjs()
X,y = shap.datasets.diabetes()
X_train,X_test,y_train,y_test = train_test_split(X, y, test_size=0.2, random_state=0)
# rather than use the whole training set to estimate expected values, we summarize with
# a set of weighted kmeans, each weighted by the number of points they represent.
X_train_summary = shap.kmeans(X_train, 10)
kernel = Matern(length_scale=2, nu=3/2) + WhiteKernel(noise_level=1)
gp = GaussianProcessRegressor(kernel)
gp.fit(X_train, y_train)
# explain all the predictions in the test set
explainer = shap.KernelExplainer(gp.predict, X_train_summary)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test)
Running the above code gives a summary plot in which every SHAP value is zero.
When I use a neural network or linear regression instead, the same code works without problems.
If you have any idea how to solve this issue, please let me know.
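Your model doesn't predict anything. If its output barely changes with the input, every SHAP value will be zero. Plot the predictions against the true values to check:
import matplotlib.pyplot as plt
plt.scatter(y_test, gp.predict(X_test));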
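To see why a model with constant output yields all-zero SHAP values, here is a minimal pure-Python sketch that computes exact Shapley values by brute force. It is not how `shap.KernelExplainer` is implemented (that uses a sampled, weighted regression); `exact_shapley`, `constant_model`, `linear_model` and the toy data are hypothetical names and values introduced only for this illustration.

```python
from itertools import combinations
from math import factorial


def exact_shapley(predict, x, background):
    """Brute-force Shapley values of `predict` at the point `x`.

    Features outside the coalition are filled in from each background
    row, and the resulting predictions are averaged.
    """
    n = len(x)

    def value(subset):
        # Mean prediction when features in `subset` are taken from x
        # and every other feature is taken from a background row.
        total = 0.0
        for row in background:
            mixed = [x[i] if i in subset else row[i] for i in range(n)]
            total += predict(mixed)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Toy data (assumed, not from the diabetes dataset).
background = [[0.0, 0.0, 0.0], [1.0, 2.0, 3.0]]
x = [2.0, -1.0, 0.5]


def constant_model(row):
    # Stands in for a regressor whose output ignores its input.
    return 0.0


def linear_model(row):
    # Uses the first two features only; the third is ignored.
    return 3.0 * row[0] - 2.0 * row[1]


phi_constant = exact_shapley(constant_model, x, background)
phi_linear = exact_shapley(linear_model, x, background)
print("constant model:", phi_constant)
print("linear model:  ", phi_linear)
```

Every marginal contribution of the constant model is a difference of two identical numbers, so all attributions vanish regardless of the data. For the linear model the attributions are nonzero for the features it uses, and they sum to the prediction at `x` minus the mean prediction over the background rows.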
Train your model properly (for instance with the DotProduct() + WhiteKernel() kernel from the full example below), then plot again:
plt.scatter(y_test, gp.predict(X_test));
and you're good to go:
explainer = shap.KernelExplainer(gp.predict, X_train_summary)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test)
Full reproducible example:
import sklearn
from sklearn.model_selection import train_test_split
import numpy as np
import shap
import time
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import WhiteKernel, DotProduct
X,y = shap.datasets.diabetes()
X_train,X_test,y_train,y_test = train_test_split(X, y, test_size=0.2, random_state=0)
X_train_summary = shap.kmeans(X_train, 10)
kernel = DotProduct() + WhiteKernel()
gp = GaussianProcessRegressor(kernel)
gp.fit(X_train, y_train)
explainer = shap.KernelExplainer(gp.predict, X_train_summary)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test)
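If you would rather have a numeric check than a scatter plot, a small guard like the one below can flag a collapsed model before you spend time on `KernelExplainer`. This is only a sketch: `is_effectively_constant`, the `rel_tol` threshold and the demo numbers are hypothetical and not part of SHAP or scikit-learn.

```python
from statistics import pstdev


def is_effectively_constant(predictions, targets, rel_tol=1e-3):
    """Return True when predictions vary far less than the targets do."""
    target_spread = pstdev(targets)
    if target_spread == 0:
        # The targets themselves are constant; there is nothing to compare against.
        return False
    return pstdev(predictions) / target_spread < rel_tol


# Made-up numbers for illustration only.
y_demo = [151.0, 75.0, 141.0, 206.0, 135.0]
collapsed = [0.0, 0.0, 0.0, 0.0, 0.0]            # model output ignores the input
reasonable = [160.0, 90.0, 120.0, 190.0, 140.0]  # model output tracks the targets

print(is_effectively_constant(collapsed, y_demo))
print(is_effectively_constant(reasonable, y_demo))
```

With the objects from the example above, the call would presumably look like `is_effectively_constant(list(gp.predict(X_test)), list(y_test))`; the threshold is arbitrary and may need adjusting for your data.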