r, ggplot2, ggdendro

How to extend the length of leaf nodes in a dendrogram and add node labels


My task is to create a dendrogram, but the leaf nodes show blunt edges. How can I extend the length of the leaf nodes and add node labels?

Please see the current and expected images below.

Data:

df1 <- data.frame( z1 = c(rep('P1', 5), rep('P2', 5), rep('P3', 3), rep('P4', 4)),
                   z2 = c(letters[1:5], letters[6:10], letters[11:13], letters[14:17]),
                   stringsAsFactors = FALSE)

Code:

library('data.table')
library('ggplot2')
library('ggdendro')
library('grid')

setDT(df1)
ddata <- dcast( data = df1[, .(z1, z2)],
                formula = z2 ~ z1, 
                fill = 0, 
                fun.aggregate = length, 
                value.var = 'z2')
setDF( ddata)
row.names(ddata) <- ddata$z2
ddata$z2 <- NULL
ddata <- dendro_data( as.dendrogram( hclust( dist( ddata), method = "average")))
p <- ggplot(segment(ddata)) + 
  geom_segment(aes(x = x, y = y, xend = xend, yend = yend)) + 
  theme_dendro()
print(p)

Current plot:

enter image description here

Expected Plot:

enter image description here


Solution

  • There are a couple of ways to do this. The simplest is probably to apply a function recursively over the nodes of the dendrogram using dendrapply.

    First, assign the dendrogram to its own object before calling dendro_data (this replaces the nested call in your code):

    dendro <-  as.dendrogram(hclust(dist(ddata), method = "average"))
    

    and then create a simple function that reduces the height of leaf nodes by a given amount (d):

    # Lower each leaf node by d so that its branch extends below the last merge
    dropleaf <- function(x, d = 1){
      if(is.leaf(x)) attr(x, "height") <- attr(x, "height") - d
      return(x)
    }
    

    The function can be applied over all nodes as follows:

    dendro <- dendrapply(dendro, dropleaf, d = 0.2)
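
To check what dendrapply does with dropleaf, here is a minimal base-R sketch. The toy matrix `m` and the helper `leaf_heights` are hypothetical and only serve the illustration; they are not part of the question's data.

```r
# Toy data (hypothetical): five points on a line, average-linkage clustering.
m <- matrix(c(0, 1, 3, 7, 12), ncol = 1,
            dimnames = list(letters[1:5], "v"))
dendro <- as.dendrogram(hclust(dist(m), method = "average"))

# Same helper as in the answer: lower leaves by d, leave internal nodes alone.
dropleaf <- function(x, d = 1) {
  if (is.leaf(x)) attr(x, "height") <- attr(x, "height") - d
  x
}

# Hypothetical helper: collect the height of every leaf, left to right.
leaf_heights <- function(x) {
  if (is.leaf(x)) return(attr(x, "height"))
  unlist(lapply(seq_along(x), function(i) leaf_heights(x[[i]])))
}

before  <- leaf_heights(dendro)
dropped <- dendrapply(dendro, dropleaf, d = 0.2)
after   <- leaf_heights(dropped)

# Each leaf is 0.2 lower than before; the root height is unchanged.
print(after - before)
print(attr(dropped, "height") - attr(dendro, "height"))
```

Only leaf heights change, so the merge heights (and therefore the shape of the tree above the leaves) stay as hclust computed them.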
    

    If you intend to plot the axis, you can re-scale the plot so that the lowest point is reset to zero using:

    dendro <- phylogram::reposition(dendro, shift = "reset")
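
If you would rather not add the phylogram dependency, a similar shift can be sketched in base R by subtracting the lowest node height from every node. This imitates the effect of the "reset" shift as described above; it is not phylogram's implementation, and the toy data and the helper `node_heights` are hypothetical.

```r
# Toy dendrogram (hypothetical data) with leaves already lowered by 0.2.
m <- matrix(c(0, 1, 3, 7, 12), ncol = 1,
            dimnames = list(letters[1:5], "v"))
dendro <- as.dendrogram(hclust(dist(m), method = "average"))
dendro <- dendrapply(dendro, function(x) {
  if (is.leaf(x)) attr(x, "height") <- attr(x, "height") - 0.2
  x
})

# Hypothetical helper: heights of all nodes, internal nodes and leaves alike.
node_heights <- function(x) {
  h <- attr(x, "height")
  if (is.leaf(x)) return(h)
  c(h, unlist(lapply(seq_along(x), function(i) node_heights(x[[i]]))))
}

# Shift every node by the same amount so the lowest node sits at zero.
lowest  <- min(node_heights(dendro))
shifted <- dendrapply(dendro, function(x) {
  attr(x, "height") <- attr(x, "height") - lowest
  x
})

print(min(node_heights(shifted)))
```

Because every node moves by the same amount, branch lengths are preserved and only the axis origin changes.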
    

    You can then proceed with the rest of your code:

    ddata <- dendro_data(dendro)
    p <- ggplot(segment(ddata)) + 
      geom_segment(aes(x = x, y = y, xend = xend, yend = yend)) + 
      theme_dendro()
    print(p)
    

    producing the following output:

    dendrogram
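
The steps above lengthen the leaf branches but do not draw any labels. Here is a hedged sketch of one way to add leaf labels with ggdendro's label() and geom_text(). It assumes the labels wanted are the leaf names (the expected image is not available here). It also builds the 0/1 matrix with base R's table() as a stand-in for the dcast() step and skips the optional re-scaling step. The names `mat`, `leaf_labels` and `leaf_y` are illustrative.

```r
library(ggplot2)
library(ggdendro)

df1 <- data.frame(z1 = c(rep("P1", 5), rep("P2", 5), rep("P3", 3), rep("P4", 4)),
                  z2 = letters[1:17],
                  stringsAsFactors = FALSE)

# Base-R stand-in for the dcast() step: one row per z2, one count column per z1.
mat <- unclass(table(df1$z2, df1$z1))

dendro <- as.dendrogram(hclust(dist(mat), method = "average"))
dendro <- dendrapply(dendro, function(x, d = 0.2) {
  if (is.leaf(x)) attr(x, "height") <- attr(x, "height") - d
  x
})

ddata <- dendro_data(dendro)

# Place each label at the lowest segment end, i.e. at the tip of the leaves.
leaf_labels <- label(ddata)
leaf_y <- min(segment(ddata)$yend)
leaf_labels$y <- leaf_y

p <- ggplot(segment(ddata)) +
  geom_segment(aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_text(data = leaf_labels, aes(x = x, y = y, label = label), vjust = 1.5) +
  expand_limits(y = leaf_y - 0.1) +  # leave room so labels are not clipped
  theme_dendro()
print(p)
```

The label position is taken from the segment data rather than from the y column that label() returns, so the text follows the lowered leaves regardless of how ggdendro fills that column.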