I have a (symmetric) matrix M
that represents the distance between each pair of nodes. For example,
A B C D E F G H I J K L A 0 20 20 20 40 60 60 60 100 120 120 120 B 20 0 20 20 60 80 80 80 120 140 140 140 C 20 20 0 20 60 80 80 80 120 140 140 140 D 20 20 20 0 60 80 80 80 120 140 140 140 E 40 60 60 60 0 20 20 20 60 80 80 80 F 60 80 80 80 20 0 20 20 40 60 60 60 G 60 80 80 80 20 20 0 20 60 80 80 80 H 60 80 80 80 20 20 20 0 60 80 80 80 I 100 120 120 120 60 40 60 60 0 20 20 20 J 120 140 140 140 80 60 80 80 20 0 20 20 K 120 140 140 140 80 60 80 80 20 20 0 20 L 120 140 140 140 80 60 80 80 20 20 20 0
Is there any method to extract clusters from M
(if needed, the number of clusters can be fixed), such that each cluster contains nodes with small distances between them? In the example, the clusters would be (A, B, C, D)
, (E, F, G, H)
and (I, J, K, L)
.
Hierarchical clustering works directly with the distance matrix instead of the actual observations. If you know the number of clusters, you will already know your stopping criterion (stop when there are k clusters). The main trick here will be to choose an appropriate linkage method. Also, this paper(pdf) gives an excellent overview of all kinds of clustering methods.