I need to calculate the phylogenetic signal of more than 100 variables and store the results (the K statistic and its p-value) in a data frame with three columns: variable name, K statistic, and p-value. I know how to do it for one variable, but I don't want to repeat that process 100 times. I also think a loop would be more efficient and less error-prone.
This is how I think it should go; I just don't know how to implement it. First, some dummy data:
require(geiger)
require(phytools)
tree<-sim.bdtree(b=0.1,d=0,stop="taxa",n=50,extinct=FALSE)
trait<-matrix(rTraitCont(compute.brlen(tree,power=5),model="BM"),50,10)
trait <- as.data.frame(trait)
rownames(trait)<-tree$tip.label
# This is how it is done for one variable at a time:
trait.1 <- setNames(trait$V1, rownames(trait))
trait.1.test <- phylosig(tree, trait.1, method = 'K', test = T)
trait.1.test$K
trait.1.test$P
Then I think that it should be a for loop with this structure:
# list1 <- list()
# List.Of.Kvalues <- list()
# List.Of.Pvalues <- list()
#For loop {
# First I need a list that contains each column, named with the tree tip labels or the row names of the original data frame (these two are equal)
# list1 <- list(setNames(trait[col1], rownames(trait)))
# Second, I will use each element of list1 to calculate the phylogenetic signal, storing the K values in one list and the p-values in another
# List.Of.Kvalues <- phylosig(tree, list1[], method = 'K', test = T)$K
# List.Of.Pvalues <- phylosig(tree, list1[], method = 'K', test = T)$P
# }
# Finally, create the data frame
# df <- rbind(colnames(trait),List.Of.Kvalues, List.Of.Pvalues)
My knowledge of how to write loops is very basic. I hope somebody can help me understand how to build this kind of loop. Thank you!
Using a for loop:
library(geiger)
library(phytools)
# Initialization part
tree<-sim.bdtree(b=0.1,d=0,stop="taxa",n=50,extinct=FALSE)
trait<-matrix(rTraitCont(compute.brlen(tree,power=5),model="BM"),50,10)
trait <- as.data.frame(trait)
rownames(trait)<-tree$tip.label
n <- ncol(trait)
Kvalues <- numeric(n)
Pvalues <- numeric(n)
# Loop over each column and get the K and p-values
for(i in seq_len(n)) {
  trait.1 <- setNames(trait[[i]], rownames(trait))
  trait.1.test <- phylosig(tree, trait.1, method = 'K', test = TRUE)
  Kvalues[i] <- trait.1.test$K
  Pvalues[i] <- trait.1.test$P
}
Create a data frame combining all values:
out <- data.frame(colname = names(trait), K = Kvalues, P = Pvalues)
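The same result can probably be had without an explicit loop by applying a small helper to every column with sapply. The following is only a sketch under a few assumptions: get_signal is a hypothetical helper name, the tree is simulated with pbtree and the traits with fastBM (both from phytools) rather than with the question's geiger code, and set.seed is there only to make the simulated data reproducible. Note that matrix(x, 50, 10) in the dummy data recycles a single simulated trait into 10 identical columns, so this sketch draws each column independently with replicate.

```r
library(phytools)

set.seed(42)  # assumption: only for reproducible dummy data

# Dummy data: a 50-tip tree and 10 independently simulated BM traits.
tree <- pbtree(n = 50)
trait <- as.data.frame(replicate(10, fastBM(tree)))  # rows are named by tip label

# Hypothetical helper: returns K and its p-value for one trait column.
get_signal <- function(x) {
  res <- phylosig(tree, setNames(x, rownames(trait)), method = "K", test = TRUE)
  c(K = as.numeric(res$K), P = as.numeric(res$P))
}

# sapply returns a 2 x n matrix (rows K and P); transpose to one row per variable.
stats <- t(sapply(trait, get_signal))
out <- data.frame(variable = rownames(stats), stats, row.names = NULL)
head(out)
```

Because the p-value comes from a randomization test, it will vary slightly between runs unless a seed is set.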