
R: How to extract all labels in a certain node of a dendrogram


I am writing a program that (as a part of it) automatically creates dendrograms from an input dataset. For each node/split I want to extract all the labels that are under that node and the location of that node on the dendrogram plot (for further plotting purposes). So, let's say my data looks like this:

> Ltrs <- data.frame("A" = c(3,1), "B" = c(1,1), "C" = c(2,4), "D" = c(6,6))
> dend <- as.dendrogram(hclust(dist(t(Ltrs))))
> plot(dend)

[Figure: the dendrogram]

Now I can extract the location of the splits/nodes:

> library(dendextend)
> nodes <- get_nodes_xy(dend)
> nodes <- nodes[nodes[,2] != 0, ]
> nodes
      [,1]     [,2]
[1,] 1.875 7.071068
[2,] 2.750 3.162278
[3,] 3.500 2.000000

Now I want to get all the labels under each node, i.e. one set of labels for each row of the 'nodes' matrix.

This should look something like this:

$`1`
[1] "D" "C" "B" "A"

$`2`
[1] "C" "B" "A"

$`3`
[1] "B" "A"

Can anybody help me out? Thanks in advance :)


Solution

  • How about something like this?

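    library(purrr)       # provides reduce(); library(tidyverse) loads it too
    library(dendextend)
    Ltrs <- data.frame("A" = c(3,1), "B" = c(1,1), "C" = c(2,4), "D" = c(6,6))
    dend <- as.dendrogram(hclust(dist(t(Ltrs))))
    
    accumulator <- list()
    myleaves <- function(anode) {
        # A leaf is not a list: just return its label.
        if (!is.list(anode)) return(attr(anode, "label"))
        # Internal node: collect the labels of all children, record them
        # in the global accumulator, and return them to the caller.
        labs <- reduce(lapply(anode, myleaves), c)
        accumulator[[length(accumulator) + 1]] <<- labs
        labs
    }
    
    myleaves(dend)
    ret <- rev(accumulator)  # traversal is depth-first, so the root is recorded last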
    

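    Better test this; I am not very trustworthy. In particular, check that the entries of ret come out in the same order as the rows of nodes. Reversing a depth-first traversal puts the root first, but when both branches of a split contain further splits the remaining entries may not line up with the rows, and then associating them with the correct nodes is going to be a pain. Good luck.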
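One way to make the ordering easier to check is to record each split's height together with its labels, visiting every node before its children (root, then first branch, then second branch). The following is a minimal base-R sketch, not a tested drop-in solution: the helper name splits_preorder is made up for illustration, and it is only an assumption that this visiting order matches the row order returned by get_nodes_xy. The stored heights can be compared with the second column of nodes to confirm the match.

```r
# Toy data from the question: four labelled points (columns A to D).
Ltrs <- data.frame(A = c(3, 1), B = c(1, 1), C = c(2, 4), D = c(6, 6))
dend <- as.dendrogram(hclust(dist(t(Ltrs))))

# Hypothetical helper: returns one entry per internal node (split),
# each holding the node's height and the leaf labels beneath it.
# Nodes are visited before their children (pre-order).
splits_preorder <- function(node) {
  if (is.leaf(node)) {
    return(list())  # leaves are not splits, so they contribute nothing
  }
  here <- list(list(height = attr(node, "height"), labels = labels(node)))
  for (i in seq_along(node)) {
    here <- c(here, splits_preorder(node[[i]]))
  }
  here
}

splits <- splits_preorder(dend)

# Split the result into the two pieces asked for in the question.
heights <- vapply(splits, function(s) s$height, numeric(1))
node_labels <- lapply(splits, `[[`, "labels")
names(node_labels) <- seq_along(node_labels)

heights      # compare with nodes[, 2] from the question
node_labels  # one character vector of labels per split
```

For the example data this gives three entries, holding four, three and two labels respectively. If heights agrees row by row with nodes[, 2], the entries of node_labels can be paired with the rows of nodes directly; if not, the heights can be used to match them up, provided no two splits share the same height.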