I need to create a dendrogram, but the leaf nodes end in blunt edges. How can I extend the length of the leaf nodes and add node labels?
Please see the current and expected images below.
Data:
df1 <- data.frame(z1 = c(rep('P1', 5), rep('P2', 5), rep('P3', 3), rep('P4', 4)),
                  z2 = c(letters[1:5], letters[6:10], letters[11:13], letters[14:17]),
                  stringsAsFactors = FALSE)
Code:
library('data.table')
library('ggplot2')
library('ggdendro')
library('grid')
setDT(df1)
# Cast to a wide count table: one row per z2, one column per z1
ddata <- dcast(data = df1[, .(z1, z2)],
               formula = z2 ~ z1,
               fill = 0,
               fun.aggregate = length,
               value.var = 'z2')
setDF(ddata)
# Use z2 as row names so that it becomes the leaf labels
row.names(ddata) <- ddata$z2
ddata$z2 <- NULL
# Cluster the rows and extract the segment data for plotting
ddata <- dendro_data(as.dendrogram(hclust(dist(ddata), method = "average")))
p <- ggplot(segment(ddata)) +
  geom_segment(aes(x = x, y = y, xend = xend, yend = yend)) +
  theme_dendro()
print(p)
Current plot:
Expected Plot:
There are a couple of ways to do this. The simplest is probably to apply a function recursively over the nodes of the dendrogram using dendrapply.
First, assign the dendrogram to its own object. This line takes the place of the dendro_data call in your code, so it has to run while ddata is still the count table:
dendro <- as.dendrogram(hclust(dist(ddata), method = "average"))
Then create a simple function that reduces the height of leaf nodes by a given amount (d):
dropleaf <- function(x, d = 1){
  # Only leaves are lowered; internal nodes are returned unchanged
  if(is.leaf(x)) attr(x, "height") <- attr(x, "height") - d
  return(x)
}
The function can be applied over all nodes as follows:
dendro <- dendrapply(dendro, dropleaf, d = 0.2)
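If you want to check what dropleaf does before plotting, here is a minimal base-R sketch on a toy matrix. The matrix m and the helper leaf_heights are made up for illustration and are not part of your code.

```r
# Toy data (hypothetical): four points forming two tight pairs
m <- matrix(c(0, 0, 1, 0, 5, 5, 6, 5), ncol = 2, byrow = TRUE,
            dimnames = list(c("a", "b", "c", "d"), NULL))
dendro <- as.dendrogram(hclust(dist(m), method = "average"))

dropleaf <- function(x, d = 1) {
  if (is.leaf(x)) attr(x, "height") <- attr(x, "height") - d
  x
}

# Hypothetical helper: collect the height attribute of every leaf
leaf_heights <- function(node) {
  if (is.leaf(node)) return(attr(node, "height"))
  unlist(lapply(seq_along(node), function(i) leaf_heights(node[[i]])))
}

before  <- leaf_heights(dendro)                    # leaves of an hclust tree sit at 0
dropped <- dendrapply(dendro, dropleaf, d = 0.2)
after   <- leaf_heights(dropped)                   # every leaf lowered by d

print(before)
print(after)
```

Only the leaves move; the internal merge heights stay where hclust put them, which is why the branches look longer in the plot.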
If you intend to plot the axis, you can re-scale the plot so that the lowest point is reset to zero, using reposition from the phylogram package:
dendro <- phylogram::reposition(dendro, shift = "reset")
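If you would rather not add another package, a uniform upward shift can be sketched in base R with dendrapply as well. This is only an approximation of the idea, assuming all that is needed is to move every node up by the same amount; the toy matrix m and the helpers node_heights and shift_up are hypothetical names, and this is not the phylogram implementation.

```r
# Toy data (hypothetical), with leaves already lowered by 0.2
m <- matrix(c(0, 0, 1, 0, 5, 5, 6, 5), ncol = 2, byrow = TRUE,
            dimnames = list(c("a", "b", "c", "d"), NULL))
dendro <- as.dendrogram(hclust(dist(m), method = "average"))
dendro <- dendrapply(dendro, function(x) {
  if (is.leaf(x)) attr(x, "height") <- attr(x, "height") - 0.2
  x
})

# Hypothetical helper: heights of all nodes, internal nodes and leaves
node_heights <- function(node) {
  h <- attr(node, "height")
  if (is.leaf(node)) return(h)
  c(h, unlist(lapply(seq_along(node), function(i) node_heights(node[[i]]))))
}

# Hypothetical helper: move a single node by a fixed amount
shift_up <- function(x, by) {
  attr(x, "height") <- attr(x, "height") - by
  x
}

lowest  <- min(node_heights(dendro))               # -0.2 after the drop
shifted <- dendrapply(dendro, shift_up, by = lowest)

print(min(node_heights(shifted)))                  # lowest node is now at zero
```

Because every node moves by the same amount, the shape of the tree is unchanged and only the axis values differ.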
You can then proceed with the rest of your code:
ddata <- dendro_data(dendro)
p <- ggplot(segment(ddata)) +
  geom_segment(aes(x = x, y = y, xend = xend, yend = yend)) +
  theme_dendro()
print(p)
producing the following output:
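For the node labels you also asked about, one option is to add a geom_text layer fed by label() from ggdendro, which returns the leaf positions and their text. The sketch below is a self-contained version under a few assumptions: it builds the count matrix with base table() instead of dcast purely to keep the example short, and the names m, leaf_labels and leaf_y, as well as the vjust and expand_limits values, are illustrative choices you will probably want to tune.

```r
library(ggplot2)
library(ggdendro)

# Data from the question
df1 <- data.frame(z1 = c(rep('P1', 5), rep('P2', 5), rep('P3', 3), rep('P4', 4)),
                  z2 = c(letters[1:5], letters[6:10], letters[11:13], letters[14:17]),
                  stringsAsFactors = FALSE)

# Shortcut (assumption): z2-by-z1 count matrix via table() rather than dcast()
m <- unclass(table(df1$z2, df1$z1))

dendro <- as.dendrogram(hclust(dist(m), method = "average"))
dropleaf <- function(x, d = 1) {
  if (is.leaf(x)) attr(x, "height") <- attr(x, "height") - d
  x
}
dendro <- dendrapply(dendro, dropleaf, d = 0.2)

ddata <- dendro_data(dendro)
leaf_labels <- label(ddata)            # data frame with x, y and label columns

# All leaves share one height here, so the lowest segment end is the leaf tip
leaf_y <- min(segment(ddata)$yend)

p <- ggplot(segment(ddata)) +
  geom_segment(aes(x = x, y = y, xend = xend, yend = yend)) +
  # Place each label just below its leaf tip
  geom_text(data = leaf_labels, aes(x = x, label = label),
            y = leaf_y, vjust = 1.5) +
  expand_limits(y = leaf_y - 0.1) +    # leave room so labels are not clipped
  theme_dendro()
print(p)
```

The label position is taken from the segment data rather than from the y column of label(), so the text follows the leaf tips after they have been lowered.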