I want to make a circular view of a hierarchical clustering with the ggraph
package. With base R plotting the non-circular version is made simple enough:
hcity.D2 <- hclust(UScitiesD, "ward.D2")
plot(hcity.D2)
The main feature I struggle with conceptually is how to label the leaf nodes. The code below creates the network, but it's worthless without that text.
ggraph(graph = hcity.D2, layout = "dendrogram", circular = TRUE) +
  geom_edge_link() +
  coord_equal() +
  theme_void()
I suspect I am supposed to use geom_node_text to get at the city labels. However, it's not clear to me how to figure out what gets included in the underlying tidygraph data set, or whether enough is there to add labels on just the leaf nodes.
There is plenty of documentation for doing advanced things with ggraph, but not much seems to exist for routine analyses like a hierarchical clustering.
The leaf labels are stored in the hierarchical clustering object. To use them, first convert hcity.D2 to a tidygraph, then map the labels to the nodes. geom_node_text(aes(label = ifelse(leaf, label, NA))) places labels only on leaf nodes by checking the leaf column. Note that repel = TRUE nudges the labels to keep them from overlapping.
library(ggraph)
library(tidygraph)
library(igraph)
# Perform hierarchical clustering
hcity.D2 <- hclust(UScitiesD, "ward.D2")
#plot(hcity.D2)
# Convert hclust object to a tidygraph
graph <- as_tbl_graph(hcity.D2)
# Create the circular dendrogram with labels for leaf nodes
ggraph(graph, layout = "dendrogram", circular = TRUE) +
  geom_edge_link() +
  geom_node_text(aes(label = ifelse(leaf, label, NA)), size = 3, repel = TRUE) +
  coord_equal() +
  theme_void()
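
To see what is available for mapping in the underlying tidygraph data, it may help to pull out the node table and look at its columns. The following is a minimal sketch, assuming the conversion exposes the leaf and label node columns that the plot code above relies on; the names nodes and leaf_labels are only illustrative.

```r
library(tidygraph)

# Same clustering as above
hcity.D2 <- hclust(UScitiesD, "ward.D2")
graph <- as_tbl_graph(hcity.D2)

# Extract the node data as a tibble to inspect the available columns
nodes <- as_tibble(activate(graph, nodes))
print(nodes)

# Keep only the labels of leaf nodes (assumes a logical `leaf` column)
leaf_labels <- nodes$label[nodes$leaf]
print(leaf_labels)
```

Any column shown in the node table can be used inside aes() in geom_node_text, which is why ifelse(leaf, label, NA) works in the plot.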