r, k-means, labeling

Labeling a particular cluster in K-means in R


What changes to the code below are required to label only the data points in cluster 3?

library(datasets)
head(iris)
library(ggplot2)
ggplot(iris, aes(Petal.Length, Petal.Width, color = Species)) + geom_point()
set.seed(20)
irisCluster <- kmeans(iris[, 3:4], 3, nstart = 20)
irisCluster

table(irisCluster$cluster, iris$Species)
# Output columns: setosa, versicolor, virginica (counts not shown)

irisCluster$cluster <- as.factor(irisCluster$cluster)
ggplot(iris, aes(Petal.Length, Petal.Width, color = irisCluster$cluster)) + geom_point()

Solution

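  • You can set the label to an empty string for every point that is not in cluster 3. You may need to adjust the label positions (vjust, hjust) to suit your actual data. Note that k-means cluster numbers are arbitrary, so check which group is numbered 3 in your run.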

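    library(dplyr)
    library(ggplot2)
    
    iris %>%
      # Attach the cluster assignment; as.factor() keeps the colours discrete
      # even if irisCluster$cluster was not converted beforehand
      mutate(cluster = as.factor(irisCluster$cluster), 
             # Blank out the label for every point outside cluster 3
             label = replace(Petal.Length, cluster != 3, '')) %>%
      ggplot() + aes(Petal.Length, Petal.Width, color = cluster, label = label) + 
      # vjust/hjust shift the text so it does not sit on top of the point
      geom_point() + geom_text(vjust = -0.5, hjust = -0.4)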
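
  • An alternative, sketched here under the assumption that the same iris setup is used, is to pass only the cluster 3 rows to geom_text through its data argument instead of blanking labels. The names iris_clustered, target_cluster and labelled are hypothetical and chosen only for illustration.

```r
library(dplyr)
library(ggplot2)

set.seed(20)
irisCluster <- kmeans(iris[, 3:4], 3, nstart = 20)

# Attach the cluster assignment as a factor so ggplot2 uses discrete colours
iris_clustered <- iris %>%
  mutate(cluster = factor(irisCluster$cluster))

# Cluster numbers are arbitrary; inspect
# table(iris_clustered$cluster, iris_clustered$Species) before picking one
target_cluster <- "3"
labelled <- filter(iris_clustered, cluster == target_cluster)

p <- ggplot(iris_clustered, aes(Petal.Length, Petal.Width, color = cluster)) +
  geom_point() +
  # Only the rows in `labelled` are drawn as text; other clusters get no label
  geom_text(data = labelled, aes(label = Petal.Length),
            vjust = -0.5, hjust = -0.4, show.legend = FALSE)

print(p)
```

    With this variant the label column stays numeric, because no empty strings are mixed into it, and changing target_cluster is enough to label a different group.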