Tags: r, annotations, visualization, minimum-spanning-tree, ape

How to annotate distances of distance matrix to edges of a minimum-spanning-tree in R?


Aloha guys!

I am trying to draw a minimum spanning tree with ggplot2, because I want to use ggplot2 and especially ggnetwork functions such as geom_edgelabel() to get a polished, customizable tree plot. What I have not managed to do is annotate the edges with the distance values from the distance matrix that underlies the minimum spanning tree.

First, I compute a distance matrix from a matrix of integers.

set.seed(1)
matrix <- matrix(sample(1:20, 10 * 20, replace = TRUE), nrow = 10, ncol = 20)
dist <- dist(matrix)

I then pass the distance matrix to ape::mst() to get a minimum spanning tree.

if (!require(ape))
  install.packages('ape')
library(ape)

ape_mst <- ape::mst(dist)

To convert the resulting mst object into a format that ggplot() accepts, I passed it to igraph::graph.adjacency() to get an igraph object that contains the edge information.

if (!require(igraph))
  install.packages('igraph')
library(igraph)

gr_adj <- graph.adjacency(ape_mst, "undirected")

As a last step, I used ggnetwork() from the ggnetwork package to get a data frame that I can finally hand over to ggplot() to draw my tree.

if (!require(ggnetwork))
  install.packages('ggnetwork')
library(ggnetwork)

gg <- ggnetwork(gr_adj, arrow.gap = 0, layout = layout_with_fr(gr_adj))

ggplot(gg, aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_edges() +
  geom_nodelabel(aes(label = name)) +
  geom_edgelabel(aes(label = name), size = 2)

Unfortunately, the data frame doesn't store any edge information, so I can't annotate the edges with, e.g., my desired distances. Using the Claddis package I can easily retrieve the distance for each edge of a minimum spanning tree given a distance matrix, but I was not able to annotate them because of the structure of the objects I created beforehand.

if (!require(Claddis))
  install.packages('Claddis')
library(Claddis)

find_minimum_spanning_edges(dist)

Would be happy if someone has a suggestion on how to solve this!

EDIT: Following @allan-cameron's answer I modified my code as follows, but it still doesn't let me annotate the edge lengths the way I want when using ape::mst instead of igraph::mst (see the images below).

    library(ggraph)

    set.seed(1)

    dist <- dist(matrix(sample(1:20, 10 * 20, replace = TRUE), nrow = 10, ncol = 20))

    ape_mst <- ape::mst(dist)

    gr_adj <- graph.adjacency(ape_mst, weighted = TRUE)

    ggraph(gr_adj, layout = "igraph", algorithm = "nicely") +
      geom_edge_link(aes(label = round(weight, 3)),
                     angle_calc = "along", vjust = -0.5) +
      geom_node_label(aes(label = name), size = 2) +
      scale_edge_width_continuous(range = c(0.5, 1.5), guide = "none") +
      theme_graph()

    find_minimum_spanning_edges(dist)

tree plot

edges by find_minimum_spanning_edges

The edge annotations should correspond to the edge lengths, as in this plot, which I am trying to reproduce:

target plot


Solution

  • You can do all of this in igraph and ggraph

    library(igraph)
    library(ggraph)
    
    set.seed(2)
    
    matrix(sample(1:20, 10 * 20, replace = TRUE), nrow = 10, ncol = 20) |>
      dist() |>
      as.matrix() |>
      graph.adjacency(weighted = TRUE) |>
      mst() |>
      ggraph(layout = "igraph", algorithm = "kk", weights = weight) +
      geom_edge_link(aes(width = weight, label = round(weight, 1)),
                     angle_calc = "along", vjust = -0.5) +
      geom_node_point(size = 10, fill = "white", shape = 21) +
      geom_node_text(aes(label = name)) +
      scale_edge_width_continuous(range = c(0.5, 1.5), guide = "none") +
      theme_graph() +
      coord_equal()
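
If you would rather keep ape::mst(), as in the question's edit, here is one possible sketch. It assumes that ape::mst() returns a 0/1 adjacency matrix with the same row and column order as as.matrix() of the distance object, so the distances are gone by the time the matrix reaches igraph and every weight comes out as 1. Multiplying the two matrices element-wise puts the distances back on the tree edges. The names d, w_adj, g and p are only illustrative, and base_family = "sans" is just a guess at a font that exists on most systems.

```r
library(ape)
library(igraph)
library(ggraph)

set.seed(1)
d <- dist(matrix(sample(1:20, 10 * 20, replace = TRUE), nrow = 10, ncol = 20))

# Assumed: a 0/1 matrix, 1 where the tree has an edge, 0 elsewhere.
ape_mst <- ape::mst(d)

# Keep the distance where the tree has an edge and 0 everywhere else.
# unclass() drops the "mst" class so igraph receives a plain matrix.
# Caveat: a tree edge with distance exactly 0 would vanish here.
w_adj <- unclass(ape_mst) * as.matrix(d)

# Non-zero entries become edges; their values land in the "weight" attribute.
g <- graph_from_adjacency_matrix(w_adj, mode = "undirected", weighted = TRUE)

# Build the plot; print(p) draws it.
p <- ggraph(g, layout = "igraph", algorithm = "nicely") +
  geom_edge_link(aes(label = round(weight, 1)),
                 angle_calc = "along", vjust = -0.5) +
  geom_node_label(aes(label = name), size = 2) +
  theme_graph(base_family = "sans")
```

E(g)$weight then holds one distance per tree edge, which you can compare with the output of find_minimum_spanning_edges() as a sanity check.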
    
