After drawing a SOM plot, I want to find out which node each of my wines is mapped to.
To do that, I first need a data.frame with the ID of each wine and the number of the cluster that wine belongs to. The next step would be to show the cluster numbers on the plot, but I don't know how to do that.
library(kohonen)
data(wines)
View(wines)
#adding id for each wine
wines<-as.data.frame(wines)
wines$ID <- seq.int(nrow(wines))
#drop the ID column (column 14) so it is not used to train the SOM
som_wines<-wines[,-14]
som_model<-som(scale(som_wines), grid = somgrid(5, 5, "hexagonal"))
som_codes<-as.data.frame(som_model$codes)
#illustrating the number of clusters needed (within-cluster sum of squares for 1 to 15 clusters)
mydata <- as.data.frame(som_model$codes)
wss <- (nrow(mydata)-1)*sum(apply(mydata,2,var))
for (i in 2:15) {
wss[i] <- sum(kmeans(mydata, centers=i)$withinss)
}
plot(wss)
#som plot
som_cluster <- cutree(hclust(dist(som_codes)), 3)
plot(som_model, type="codes",bgcol= som_cluster, main = "Clusters")
add.cluster.boundaries(som_model, som_cluster)
#Here we get 3 clusters. Create the data frame that maps each wine ID to its cluster.
cluster_details <- data.frame(id=wines$ID, cluster=som_cluster[som_model$unit.classif])
Now I want the cluster numbers to be shown on the SOM plot itself. Are there any suggestions on how to do that? I would appreciate any answer :)
The answer can be found here: "add clusters and nodes from SOMbrero package to training data".
In particular, these lines:
SomModel <- som(
data = TrainingMatrix,
grid = GridDefinition,
rlen = 10000,
alpha = c(0.05, 0.01),
keep.data = TRUE
)
nb <- table(SomModel$unit.classif)
groups = 5
tree.hc = cutree(hclust(d=dist(SomModel$codes[[1]]),method="ward.D2",members=nb),groups)
result <- OrginalData
result$Cluster <- tree.hc[SomModel$unit.classif]
result$X <- SomModel$grid$pts[SomModel$unit.classif,"x"]
result$Y <- SomModel$grid$pts[SomModel$unit.classif,"y"]
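Applied to the wines example from the question, the same idea might look like the sketch below. It is not taken from the linked answer. It assumes the kohonen package, where `unit.classif` holds the winning node of each observation and `grid$pts` holds the node coordinates. The names `codes` and `node` and the fixed seed are illustrative choices. The final `text()` call assumes the plot keeps the grid coordinates as its user coordinates, which `add.cluster.boundaries()` appears to rely on as well.

```r
library(kohonen)

data(wines)      # numeric matrix shipped with kohonen
set.seed(42)     # SOM training is random; the seed is only for repeatability

som_model <- som(scale(wines), grid = somgrid(5, 5, "hexagonal"))

# Codebook vectors: a list in kohonen >= 3.0, a plain matrix in older versions
codes <- if (is.list(som_model$codes)) som_model$codes[[1]] else som_model$codes

# One cluster label per node (25 labels for the 5 x 5 grid)
som_cluster <- cutree(hclust(dist(codes)), 3)

# One row per wine: its node, the node's grid coordinates, and the node's cluster
node <- som_model$unit.classif
cluster_details <- data.frame(
  id      = seq_len(nrow(wines)),
  node    = node,
  x       = som_model$grid$pts[node, "x"],
  y       = som_model$grid$pts[node, "y"],
  cluster = som_cluster[node]
)

# Draw the map, then write each node's cluster number at the node centre
plot(som_model, type = "codes", bgcol = som_cluster, main = "Clusters")
add.cluster.boundaries(som_model, som_cluster)
text(som_model$grid$pts, labels = som_cluster, font = 2)
```

`head(cluster_details)` then shows, for each wine, the node it was mapped to and that node's cluster. Replacing `labels = som_cluster` with `labels = seq_along(som_cluster)` would print node numbers instead of cluster numbers.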