Plotting sample names and symbols in phylogenetic networks using the ggsplitnet function from the R package tanggle

I am using the ggsplitnet() function from the R package tanggle to plot a phylogenetic network. I would like to create a plot that displays both individual sample names and colored symbols at the tips of the network, where the colors represent arbitrarily defined groups.

Here is a simplified example demonstrating the issue:

# Load relevant packages
library(phangorn)
library(ggtree)
library(tanggle)

# Read in a phylogenetic network in nexus format
fdir <- system.file("extdata/trees", package = "phangorn")
Nnet <- phangorn::read.nexus.networx(file.path(fdir, "woodmouse.nxs"))

# Plot the network with sample names at the tips
ggsplitnet(Nnet) +
  geom_tiplab2()

# Define groups
mygroups <- rep(LETTERS[1:3], each = 5)

# Replace sample names with group names
Nnet$translate$label <- mygroups

# Plot the network with different colors for groups
ggsplitnet(Nnet) +
  geom_tippoint(aes(color = label), size = 5)

The issue is that I am unable to plot both individual sample names and colored symbols for groups in the same plot. It appears that the ggsplitnet function only recognizes data stored in the label column in Nnet$translate. For example, I tried adding a new column to the network object with Nnet$translate$groups <- mygroups, but it didn't work.

Any suggestions on how to achieve this would be greatly appreciated!

Solution

One way to achieve this is to convert Nnet to a dataframe using ggtree::fortify(), and then assign group values to the tip rows (those where isTip is TRUE) in a new column, e.g. "group". Based on your comment, it looks like ggtree::fortify() reorders the label values (not sure why). This is why it can be 'risky' to rely on the position of values in a single vector to assign groups in these cases. Some extra wrangling is required to ensure your group values are correct. You can either:

match() the order of label and group values
dplyr::left_join() group values using a reference dataframe that identifies which label values belong to each group
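To illustrate why positional assignment is risky and what match() does about it, here is a minimal sketch in base R. The labels and groups are made-up stand-ins (not the woodmouse data), and the names orig_labels, new_labels, and groups are hypothetical.

```r
# Toy stand-ins (hypothetical data, not taken from woodmouse.nxs)
orig_labels <- c("s1", "s2", "s3", "s4")  # order in which the groups were defined
groups      <- c("A", "A", "B", "B")      # groups[i] belongs to orig_labels[i]
new_labels  <- c("s3", "s1", "s4", "s2")  # the same labels after being reordered

# Positional assignment silently pairs labels with the wrong groups
wrong <- data.frame(label = new_labels, group = groups,
                    stringsAsFactors = FALSE)

# match() returns, for each reordered label, its position in the original order
idx   <- match(new_labels, orig_labels)
right <- data.frame(label = new_labels, group = groups[idx],
                    stringsAsFactors = FALSE)

print(wrong)
print(right)
```

In wrong, "s3" ends up in group A even though it was defined as B. In right, each label keeps the group it was originally given.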

I have provided the workflow for both. I have also added two plot options:

using a combination of Nnet and dfNnet with ggsplitnet(), which preserves the original label placement
using just dfNnet and ggplot(), which may require tinkering if it doesn't suit your needs.

Note that the ggsave() calls used to produce the example plots are included. Lower width/height values may result in labels being cropped.

library(phangorn)
library(ggtree)
library(tanggle)

# Example data
fdir <- system.file("extdata/trees", package = "phangorn")
Nnet <- phangorn::read.nexus.networx(file.path(fdir, "woodmouse.nxs"))

# Define groups
mygroups <- rep(LETTERS[1:3], each = 5)

# Create dataframe of Nnet using fortify()
dfNnet <- fortify(Nnet)

To correctly assign group values to each label, either:

# For each tip label in dfNnet, find its position in the original Nnet label order
# (mygroups was defined in that original order)
lab_ord <- match(dfNnet$label[dfNnet$isTip], Nnet$translate$label)

# Add group information to dfNnet
dfNnet$group[dfNnet$isTip] <- mygroups[lab_ord]

or:

# Based on your comment, use an existing reference dataframe which looks similar to this
df <- data.frame(label = Nnet$translate$label,
                 group = rep(LETTERS[1:3], each = 5))

# Join group values to dfNnet
dfNnet <- dplyr::left_join(dfNnet, df, by = "label")
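Whichever approach you use, it may be worth checking the result before plotting. The sketch below uses a hypothetical helper, check_tip_groups(), which is not part of ggtree or tanggle. It assumes the dataframe has label, isTip, and group columns, and that the reference dataframe has label and group columns. It is demonstrated on made-up toy data.

```r
# Hypothetical helper (not from any package): returns TRUE only if every tip row
# carries the group that the reference table lists for its label
check_tip_groups <- function(df, ref) {
  tips     <- df[!is.na(df$isTip) & df$isTip, c("label", "group")]
  expected <- ref$group[match(tips$label, ref$label)]
  !anyNA(tips$group) &&
    identical(as.character(tips$group), as.character(expected))
}

# Toy stand-in for a fortified network: two tips and one internal node
toy <- data.frame(label = c("s2", NA, "s1"),
                  isTip = c(TRUE, FALSE, TRUE),
                  group = c("B", NA, "A"),
                  stringsAsFactors = FALSE)

# Toy reference table mapping each label to its group
ref <- data.frame(label = c("s1", "s2"),
                  group = c("A", "B"),
                  stringsAsFactors = FALSE)

print(check_tip_groups(toy, ref))
```

With the objects from this answer, the equivalent call would be check_tip_groups(dfNnet, df), where df is the reference dataframe from the second option.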

Plot 1:

# Plot using Nnet and dfNnet
ggsplitnet(Nnet) +
  geom_tippoint(data = dfNnet,
                aes(colour = group),
                size = 5) +
  geom_tiplab2(aes(label = label)) +
  coord_cartesian(clip = "off") +
  theme(plot.margin = margin(1, 1, 1, 1, "cm"))

ggsave("plot1.jpg",
       width = 7, 
       height = 7,
       dpi = 150)

Plot 2:

ggplot(dfNnet, aes(x = x, y = y)) +
  geom_segment(aes(xend = xend, yend = yend),
               colour = "grey") +
  geom_point(data = dfNnet[dfNnet$isTip, ],
             aes(colour = group),
             size = 5) +
  geom_text(data = dfNnet[dfNnet$isTip, ],
            aes(label = label),
            vjust = -1,
            hjust = 1) +
  theme_void() +
  coord_cartesian(clip = "off") +
  theme(plot.margin = margin(0, 1, 0, 1, "cm"))

ggsave("plot2.jpg",
       width = 7, 
       height = 7,
       dpi = 150)