I am having a hard time understanding why my phylo tree (imported as a phyloseq object from QIIME2) appears to have no branch lengths when used in phylosig()
. I am trying to compute the phylogenetic signal of my 16S dataset compared to a single continuous metadata variable. All example datasets are included at the bottom of this question.
glacialpath <- metadata[,c(7)]
phylosig(phylo, glacialpath, method = "K", test = TRUE)
Error in vcv.phylo(tree) : the tree has no branch lengths
In addition: Warning message:
In drop.tip(tree, setdiff(tree$tip.label, names(x))) :
drop all tips of the tree: returning NULL
My tree phylo
was separated from a complete phyloseq object phylo <- phy_tree(physeq = physeq)
and has confirmed branch lengths:
Phylogenetic tree with 1290 tips and 1289 internal nodes.
Tip labels:
e54924083c02bd088c69537d02406eb8, 3112899fc7a2adb7f74a081a82c7cde4, db5d745b02355b6fed513c4953b62326, 2faf2046aa9ab2f6598df79cd52e9c7b, bec8db81cc8ec453136c82ede8327a9f, 601e47b8adcbd21d159f74421710e1b5, ...
Node labels:
0.942, 0.563, 0.711, 0.999, 0.000, 0.528, ...
Rooted; includes branch lengths.
A dput()
of phylo
is too large to include, so hopefully what I have provided is enough. Any help is appreciated.
> dput (metadata)
structure(list(sample = structure(1:48, .Label = c("sample-10",
"sample-11", "sample-12", "sample-14", "sample-16", "sample-18",
"sample-19", "sample-20", "sample-21", "sample-22", "sample-23",
"sample-24", "sample-25", "sample-26", "sample-27", "sample-28",
"sample-29", "sample-30", "sample-31", "sample-32", "sample-33",
"sample-34", "sample-35", "sample-36", "sample-37", "sample-40",
"sample-41", "sample-43", "sample-44", "sample-45", "sample-46",
"sample-50", "sample-55", "sample-56", "sample-57", "sample-58",
"sample-59", "sample-61", "sample-62", "sample-63", "sample-64",
"sample-65", "sample-67", "sample-68", "sample-69", "sample-70",
"sample-71", "sample-8"), class = "factor"), pond = structure(c(1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 1L), .Label = c("Lilly",
"RHM1", "RHM2", "SS", "TS"), class = "factor"), meh = structure(c(16L,
17L, 19L, 4L, 1L, 16L, 18L, 20L, 3L, 5L, 7L, 10L, 11L, 22L, 1L,
2L, 14L, 16L, 18L, 20L, 3L, 5L, 7L, 10L, 1L, 15L, 16L, 18L, 19L,
20L, 21L, 8L, 1L, 2L, 14L, 15L, 16L, 18L, 19L, 20L, 21L, 3L,
5L, 6L, 9L, 12L, 13L, 1L), .Label = c("0-1", "1-2.", "10-11.",
"11-12.", "12-13.", "13-14.5", "14-15", "14-16.", "14.5-17.5",
"16-17", "17-19", "17.5-18.5", "18.5-20", "2-3.", "3-4.", "4-5.",
"5-6.", "6-7.", "7-8.", "8-9.", "9-10.", "9-11."), class = "factor"),
depth = c(4, 5, 7, 11, 0, 4, 6, 8, 10, 12, 14, 16, 17, 9,
0, 1, 2, 4, 6, 8, 10, 12, 14, 16, 0, 3, 4, 6, 7, 8, 9, 14,
0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 12, 13, 14, 19, 18, 0), om = structure(c(29L,
13L, 9L, 9L, 27L, 27L, 26L, 26L, 23L, 23L, 12L, 12L, 20L,
34L, 24L, 24L, 22L, 22L, 25L, 25L, 21L, 21L, 11L, 11L, 8L,
7L, 5L, 3L, 3L, 2L, 1L, 4L, 33L, 30L, 31L, 32L, 28L, 18L,
19L, 17L, 15L, 17L, 16L, 14L, 18L, 19L, 10L, 6L), .Label = c("1.2",
"1.3", "1.4", "1.5", "1.7", "19.9", "2.1", "2.6", "2.8",
"2.9", "27.9", "29.8", "3.2", "3.3", "3.4", "3.5", "3.6",
"3.7", "3.8", "30.2", "30.6", "32.1", "32.5", "32.6", "32.9",
"35.7", "36.2", "4.2", "4.3", "5.4", "5.8", "6", "6.4", "na"
), class = "factor"), water = structure(c(19L, 16L, 12L,
13L, 28L, 28L, 35L, 35L, 30L, 30L, 31L, 31L, 28L, 36L, 27L,
27L, 33L, 33L, 34L, 34L, 31L, 31L, 29L, 29L, 20L, 21L, 15L,
5L, 3L, 2L, 1L, 4L, 26L, 23L, 24L, 25L, 22L, 13L, 14L, 7L,
8L, 9L, 11L, 10L, 17L, 18L, 6L, 32L), .Label = c("16.1",
"17", "17.5", "18.8", "21.6", "21.8", "22.3", "22.4", "22.6",
"22.7", "23.1", "23.6", "24.4", "24.5", "24.6", "25.5", "26.9",
"28", "29.1", "31.8", "32.8", "34.9", "40.5", "42.9", "43.2",
"50.9", "59", "60.1", "61.4", "61.5", "62.5", "62.9", "63.2",
"63.8", "66.4", "na"), class = "factor"), glacialpath = c(19.4,
19.4, 19.4, 19.4, 6.4, 6.4, 6.4, 6.4, 6.4, 6.4, 6.4, 6.4,
6.4, 6.4, 6.4, 6.4, 6.4, 6.4, 6.4, 6.4, 6.4, 6.4, 6.4, 6.4,
18.7, 18.7, 18.7, 18.7, 18.7, 18.7, 18.7, 18.7, 16.8, 16.8,
16.8, 16.8, 16.8, 16.8, 16.8, 16.8, 16.8, 16.8, 16.8, 16.8,
16.8, 16.8, 16.8, 19.4), elevation = c(8.2, 8.2, 8.2, 8.2,
18.3, 18.3, 18.3, 18.3, 18.3, 18.3, 18.3, 18.3, 18.3, 18.3,
18.3, 18.3, 18.3, 18.3, 18.3, 18.3, 18.3, 18.3, 18.3, 18.3,
13.4, 13.4, 13.4, 13.4, 13.4, 13.4, 13.4, 13.4, 5.5, 5.5,
5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5, 5.5,
5.5, 8.2), area = c(20536L, 20536L, 20536L, 20536L, 3571L,
3571L, 3571L, 3571L, 3571L, 3571L, 3571L, 3571L, 3571L, 3571L,
3571L, 3571L, 3571L, 3571L, 3571L, 3571L, 3571L, 3571L, 3571L,
3571L, 10369L, 10369L, 10369L, 10369L, 10369L, 10369L, 10369L,
10369L, 57003L, 57003L, 57003L, 57003L, 57003L, 57003L, 57003L,
57003L, 57003L, 57003L, 57003L, 57003L, 57003L, 57003L, 57003L,
20536L), tectonics = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L), .Label = c("outwash",
"uplift"), class = "factor"), lat = c(60.52, 60.52, 60.52,
60.52, 60.478, 60.478, 60.478, 60.478, 60.478, 60.478, 60.478,
60.478, 60.478, 60.478, 60.478, 60.478, 60.478, 60.478, 60.478,
60.478, 60.478, 60.478, 60.478, 60.478, 60.491, 60.491, 60.491,
60.491, 60.491, 60.491, 60.491, 60.491, 60.425, 60.425, 60.425,
60.425, 60.425, 60.425, 60.425, 60.425, 60.425, 60.425, 60.425,
60.425, 60.425, 60.425, 60.425, 60.52), long = c(145.601,
145.601, 145.601, 145.601, 145.382, 145.382, 145.382, 145.382,
145.382, 145.382, 145.382, 145.382, 145.382, 145.382, 145.382,
145.382, 145.382, 145.382, 145.382, 145.382, 145.382, 145.382,
145.382, 145.382, 145.547, 145.547, 145.547, 145.547, 145.547,
145.547, 145.547, 145.547, 145.473, 145.473, 145.473, 145.473,
145.473, 145.473, 145.473, 145.473, 145.473, 145.473, 145.473,
145.473, 145.473, 145.473, 145.473, 145.601), x.depth = c(1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 1L), y.depth = c(2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L,
6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L,
19L, 20L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 1L)), class = "data.frame", row.names = c(NA,
-48L))
The problem is not the lack of branch lengths in the tree. What the error message is saying is that the tree does not have the tips that correspond to the names of values in the trait variable. glacialpath
must be a named vector and the names must be present in the phylogeny.
In general, a good practice is to check for possible problems in these three areas.
phylo
is not a correct phylo
format.phylo
should contain tips with names corresponding to those in the metadata
.glacialpath
lacks names.phylo
is not a correct phylo
formatphytools::phylosig
requires that the tree is in correct phylo
format. Try to explore the tree with str(phylo)
, and see whether all values in phylo$edge.length
are numeric and there are no missing values.
phylo
should contain tips with names corresponding to those in the metadata
What are the samples the phylogenetic signal should be calculated for? Assuming that the column sample
in the metadata
contains names, reduce the tree to the size of the available data:
tr2 <- ape::keep.tip(phy = phylo, tip = metadata$sample)
plot(tr2) # to check that the tree was trimmed correctly
While phytools::phylosig
can trim the tree during processing, preparing a subset of the tree beforehand is preferable, because the user can check that the trimmed tree plots as expected, including branch lengths. In the given example, the tree tips are not names present in the trait vector names.
glacialpath
lacks namesThe glacialpath
is not a named vector. To set the names of a vector, use:
glacialpath <- setNames(object = metadata[, 7], nm = metadata$sample)
Lastly, a good practice is to name variables differently to names of functions or, in this case, classes.