python matplotlib scikit-learn svm svc

Separate svm classes for matplotlib legend



I am trying to learn sklearn, and as a simple exercise I am fitting a linear SVM. The SVC tries to predict the number of bedrooms in a house from the house's value and its area.
I have managed to get something that looks OK, but the template I took from matplotlib's documentation uses a colormap, and I don't know exactly which color corresponds to which class.

How can I add a legend that shows which class each scatter-point color corresponds to, and which class each of the SVM's decision regions corresponds to?

Also, to make this work I had to scale my features with preprocessing.scale, so the axis ticks now show the scaled values. How could I undo the scaling, or otherwise retrieve the original values, to use for the tick labels?

Here is the plot:

https://i.sstatic.net/bigiR.png (I don't have enough reputation to post directly)

And here is my code:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import style
from sklearn import preprocessing, svm

style.use('ggplot')

dataset = pd.read_csv('/Path/Paros.csv')
dataset = dataset[dataset['size'] < 3000]
X = np.array(dataset[['size', 'value']])
y = np.array(dataset['bedrooms'])  # 1-D array of class labels
X = preprocessing.scale(X)  # standardize each feature (zero mean, unit variance)

h = 0.01  # step size in the mesh
C = 0.01  # SVM regularization parameter
clf = svm.SVC(kernel='linear', C=C).fit(X, y)

# create a mesh to plot in
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
print("mesh")
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))

# predict a class for every point of the mesh
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])

# draw the decision regions, then the training points on top
Z = Z.reshape(xx.shape)
plt.contourf(xx, yy, Z, cmap=plt.cm.Paired, alpha=0.8)
plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.Paired)
plt.xlabel('Size')
plt.ylabel('Price')
plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max())

plt.show()

Solution

  • plt.colorbar() did what I was looking for.
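
As a complement to the colorbar, here is a minimal sketch of one way to get a discrete legend and tick labels in the original units. It is not taken from the original post and rests on several assumptions: the data is synthetic (a stand-in for Paros.csv), the helper to_original is a hypothetical name, and StandardScaler is swapped in for preprocessing.scale because the fitted scaler keeps the mean_ and scale_ needed to undo the scaling. The regions and points are given one explicit color per class so that the legend entries match both.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Patch
from matplotlib.ticker import FuncFormatter
from sklearn import svm
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real dataset (assumption): columns are size, value.
rng = np.random.default_rng(0)
bedrooms = np.repeat([1, 2, 3], 30)
size = bedrooms * 40 + rng.normal(0, 10, bedrooms.size)
value = bedrooms * 100_000 + rng.normal(0, 20_000, bedrooms.size)
X_raw = np.column_stack([size, value])

# A fitted StandardScaler stores mean_ and scale_, so the scaling can be undone.
scaler = StandardScaler().fit(X_raw)
X = scaler.transform(X_raw)
clf = svm.SVC(kernel="linear", C=1.0).fit(X, bedrooms)

def to_original(v, axis):
    """Map scaled coordinates on the given feature axis back to original units."""
    return v * scaler.scale_[axis] + scaler.mean_[axis]

# Mesh over the scaled feature space, then one predicted class per mesh point.
h = 0.05
xx, yy = np.meshgrid(np.arange(X[:, 0].min() - 1, X[:, 0].max() + 1, h),
                     np.arange(X[:, 1].min() - 1, X[:, 1].max() + 1, h))
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

# One explicit color per class, with contour levels placed between class values.
classes = clf.classes_
n = len(classes)
colors = [plt.cm.Paired(i / (n - 1)) for i in range(n)]
color_of = dict(zip(classes, colors))
levels = np.concatenate(([classes[0] - 0.5],
                         (classes[:-1] + classes[1:]) / 2,
                         [classes[-1] + 0.5]))

fig, ax = plt.subplots()
ax.contourf(xx, yy, Z, levels=levels, colors=colors, alpha=0.8)
ax.scatter(X[:, 0], X[:, 1], c=[color_of[b] for b in bedrooms], edgecolors="k")

# Proxy patches: one legend entry per class, shared by regions and points.
handles = [Patch(color=color_of[c], label=f"bedrooms = {c}") for c in classes]
ax.legend(handles=handles, title="Class")

# Keep plotting in scaled space, but print tick labels in original units.
ax.xaxis.set_major_formatter(FuncFormatter(lambda v, pos: f"{to_original(v, 0):.0f}"))
ax.yaxis.set_major_formatter(FuncFormatter(lambda v, pos: f"{to_original(v, 1):.0f}"))
ax.set_xlabel("Size")
ax.set_ylabel("Price")
fig.savefig("svm_legend.png")
```

The formatter only changes how the ticks are printed, so the classifier and the mesh still work on the scaled features. With preprocessing.scale alone the mean and standard deviation are not kept, so they would have to be computed from the unscaled data before scaling.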