Tags: r, plot, cluster-analysis, hierarchical-clustering, hclust

How can I find the accuracy of an agglomeration method?


I have plotted a dendrogram using the complete-linkage (maximum) agglomeration method.

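# distance_matrix is a "dist" object, e.g. the result of dist()
hc <- hclust(distance_matrix, method = "complete")
# Draw the dendrogram with the class of each observation as its leaf label
plot(hc, hang = 0, labels = ilpd_df$Class)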

Q1) How can I find the accuracy of this agglomeration method?

Q2) How should one comment on the sensitivity of test data to the agglomeration method?

Thank you =)


Solution

  • Cluster analysis is exploratory, not predictive.

    Accuracy makes sense when predicting, but not so much when exploring data. A hierarchical clustering is not a model that you can just apply to a new data point, so there is no test set to score it on!

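    The closest thing to accuracy is probably the Rand index, if you actually have labeled data. Take every pair of points and check whether the clustering and the labels agree on it: either the two points share a cluster and a label, or they are in different clusters and have different labels. The Rand index is the fraction of pairs on which they agree.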
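The Rand index described above could be computed along these lines. This is only a minimal sketch: `rand_index` is a hypothetical helper written for illustration, the toy data stands in for the real features and classes, and cutting the tree at `k = 2` is an assumption (a natural choice only if there are two classes).

```r
# Rand index: fraction of point pairs on which two partitions agree.
# Hypothetical helper for illustration, not part of base R.
rand_index <- function(clusters, labels) {
  stopifnot(length(clusters) == length(labels), length(clusters) >= 2)
  labels <- as.character(labels)                 # also accepts factors
  same_cluster <- outer(clusters, clusters, "==")
  same_label   <- outer(labels, labels, "==")
  pairs <- upper.tri(same_cluster)               # count each unordered pair once
  mean(same_cluster[pairs] == same_label[pairs])
}

# Toy data (assumption): two well-separated groups of 10 points each.
set.seed(1)
x <- rbind(matrix(rnorm(20, mean = 0), ncol = 2),
           matrix(rnorm(20, mean = 5), ncol = 2))
true_class <- rep(c("a", "b"), each = 10)

hc <- hclust(dist(x), method = "complete")
clusters <- cutree(hc, k = 2)                    # flat clusters from the tree
rand_index(clusters, true_class)
```

Because the index only looks at pairs, it does not matter which cluster number ends up corresponding to which class. The same function could be applied to trees built with other `method` values (for example `"single"` or `"average"`) to see how much the agreement with the labels changes.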