I am trying to use DBSCAN from scikit-learn to segment an image based on color. The result I'm getting has 3 clusters. My goal is to separate the buoys in the picture into different clusters, but they are showing up as the same cluster. I've tried a wide range of eps and min_samples values, but the two buoys always cluster together. My code is:
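import cv2
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import DBSCAN

img = cv2.imread("buoy1.jpg")
labimg = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)

# Downsample four times to keep the number of pixels manageable
n = 0
while n < 4:
    labimg = cv2.pyrDown(labimg)
    n = n + 1

# One row per pixel, one column per Lab channel
feature_image = np.reshape(labimg, [-1, 3])
rows, cols, chs = labimg.shape

db = DBSCAN(eps=5, min_samples=50, metric='euclidean', algorithm='auto')
db.fit(feature_image)
labels = db.labels_

plt.figure(2)
plt.subplot(2, 1, 1)
# OpenCV loads images as BGR; matplotlib expects RGB
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.subplot(2, 1, 2)
plt.imshow(np.reshape(labels, [rows, cols]))
plt.axis('off')
plt.show()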
I assume this is taking the Euclidean distance, and since the image is in Lab space, the Euclidean distance should differ between different colors. If anyone can give me guidance on this I'd really appreciate it.
Update: The answer below works. Since DBSCAN requires an array with no more than 2 dimensions, I concatenated the pixel row and column indices onto the original image as two extra channels and reshaped the result into an n x 5 matrix, where n is the image width times the image height. This seems to work for me.
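# Row and column index of every pixel, shape (rows, cols, 2)
indices = np.dstack(np.indices(img.shape[:2]))
# Append the indices to the color channels, shape (rows, cols, 5)
xycolors = np.concatenate((img, indices), axis=-1)
# np.reshape returns a new array, so keep the result
feature_image = np.reshape(xycolors, [-1, 5])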
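You need to use both color and position as features.
Right now, you are using colors only, so two objects with similar colors land in the same cluster no matter how far apart they are in the image.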
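As a rough illustration of the difference, here is a minimal sketch on a synthetic image rather than the buoy photo. The image, the eps and min_samples values, and the `position_weight` factor are all assumptions chosen for this toy case; on a real image the balance between color distance and pixel distance would likely need tuning.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Synthetic stand-in for the photo: a dark background with two
# identically colored squares that do not touch each other.
rows, cols = 20, 40
img = np.zeros((rows, cols, 3), dtype=np.float64)
img[5:15, 5:15] = (200, 50, 50)
img[5:15, 25:35] = (200, 50, 50)

# Color-only features: one row per pixel, three color columns.
colors = img.reshape(-1, 3)
labels_color = (
    DBSCAN(eps=5, min_samples=4).fit(colors).labels_.reshape(rows, cols)
)

# Color + position features: append each pixel's row and column index.
# position_weight is a hypothetical knob that trades off spatial
# distance against color distance.
position_weight = 1.0
yy, xx = np.indices((rows, cols))
features = np.column_stack(
    [colors, position_weight * yy.ravel(), position_weight * xx.ravel()]
)
# eps=1.5 links each pixel to its 8 neighbors when their colors match.
labels_xy = (
    DBSCAN(eps=1.5, min_samples=4).fit(features).labels_.reshape(rows, cols)
)

same_with_color_only = labels_color[5, 5] == labels_color[5, 25]
same_with_position = labels_xy[5, 5] == labels_xy[5, 25]
print("Squares share a cluster (color only):", same_with_color_only)
print("Squares share a cluster (color + position):", same_with_position)
```

With color alone, both squares have identical feature vectors, so no choice of eps can split them. Once position is included, pixels are only linked to nearby pixels of similar color, so spatially separated objects can form separate clusters.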