Given a multiPhylo object in R, what's the simplest way to count the number of duplicate topologies? For instance, if I randomly sample from all 15 possible resolutions of a 4-tip topology:
library(ape)
library(phytools)
m <- do.call(c, lapply(1:1000, function(x) multi2di(starTree(c('a','b','c','d')))))
I will have 1000 trees drawn from 15 possible topologies. What's the simplest way to tabulate the number of trees with each topology (i.e. so that the counts sum to 1000)?
With smallish trees (< ~20 leaves), you can use the 'TreeTools' package to convert each tree topology to a unique integer:
library('TreeTools')
library('phytools')
m <- do.call(c, lapply(1:1000, function(x) multi2di(starTree(c('a','b','c','d')))))
# Tabulate unique topologies
table(vapply64(m, as.TreeNumber, 1))
You can plot any numbered topology using:
topologyToPlot <- 2
plot(as.phylo(topologyToPlot, nTip = 4))
For larger trees, you can ensure that trees with an equivalent topology are represented identically within R by:
first, if necessary, ensuring that the trees' internal representation of tips is consistent, using m <- RenumberTips(m, m[[1]]);
then reordering the trees' internal edge and node numbering, using m <- Preorder(m).
Trees can then be compared using edge matrices as suggested by user12728748.
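As a minimal sketch of that last step (my own illustration, not necessarily the approach user12728748 had in mind), you could collapse each tree's edge matrix into a string and tabulate the strings. The names keys and counts are made up for this example, and it assumes that, after the two steps above, trees with the same topology end up with identical edge matrices:

```r
library('TreeTools')
library('phytools')

# Same example data as above: 1000 random resolutions of a 4-tip star tree
m <- do.call(c, lapply(1:1000, function(x) multi2di(starTree(c('a','b','c','d')))))

# Make tip numbering and edge ordering consistent across trees
m <- RenumberTips(m, m[[1]])
m <- Preorder(m)

# Assumption: identical topologies now share an identical edge matrix,
# so a string built from the matrix can serve as a topology key
keys <- vapply(m, function(tr) paste(t(tr$edge), collapse = ","), character(1))

# Count trees per distinct key; the counts sum to the number of trees
counts <- table(keys)
sum(counts)
```

To look at one tree per topology, m[!duplicated(keys)] picks the first tree with each key.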