r correlation pearson

Adapt method cor.test for each data frame in a list


I want to choose the method argument of cor.test in R separately for each data frame in a list of data frames.

data(iris)
iris.lst <- split(iris[, 1:2], iris$Species)
options(scipen=999)

normality1 <- lapply(iris.lst, function(x) shapiro.test(x[,1]))
p1 <- as.numeric(unlist(lapply(normality1, "[", c("p.value"))))
normality2 <- lapply(iris.lst, function(x)shapiro.test(x[,2]))
p2 <- as.numeric(unlist(lapply(normality2, "[", c("p.value"))))
try <- ifelse (p1 > 0.05 | p2 > 0.05, "spearman", "pearson")

# Because all of them are spearman:
try[3] <- "pearson"
for (i in 1: length(try)){
   results.lst <- lapply(iris.lst, function(x) cor.test(x[, 1], x[, 2], method=try[i]))
   results.stats <- lapply(results.lst, "[", c("estimate", "conf.int", "p.value"))
   stats <- do.call(rbind, lapply(results.stats, unlist))
   stats
}

But this does not run cor.test with an individual method for each data frame:

cor.test(iris.lst$versicolor[, 1], iris.lst$versicolor[, 2], method="pearson")
stats
# Should be spearman corr.coefficient but is pearson

Any advice?


Solution

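  • Let me check that I understand what you want to achieve. You have a list of data frames and a vector of corresponding methods (one method for each data frame). In your loop, lapply applies the single method try[i] to every data frame and each iteration overwrites results.lst, so after the last iteration all results use try[3], which is "pearson". If my assumption is correct, index the data frame and the method together instead of using your for loop: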

    for (i in 1: length(try)){
      results.lst <- cor.test(iris.lst[[i]][, 1], iris.lst[[i]][, 2], method=try[i])
      print(results.lst)
    }
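
    The same pairing can also be written without an explicit index. This is a minimal sketch, not the only way to do it. It assumes the methods are stored in a named vector (called cor.methods here, a hypothetical name chosen to avoid masking base R's try()) and that the asymptotic Spearman p-value is acceptable (exact = FALSE, which also avoids the warning about ties):

    ```r
    # Rebuild the inputs from the question (iris ships with base R)
    iris.lst <- split(iris[, 1:2], iris$Species)

    # Hypothetical name; holds one method per data frame, keyed by species
    cor.methods <- c(setosa = "spearman", versicolor = "spearman", virginica = "pearson")

    # Map() walks both inputs in parallel: each data frame is paired with its own method.
    # Indexing cor.methods by name keeps the pairing correct even if the order differs.
    results.lst <- Map(
      function(df, m) cor.test(df[, 1], df[, 2], method = m, exact = FALSE),
      iris.lst,
      cor.methods[names(iris.lst)]
    )

    # Check which test was actually run for one species
    results.lst$versicolor$method
    ```

    The result keeps the names of iris.lst, so individual tests can be looked up by species.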
    

    Edit: There are many ways to get your stats; here's one:

    names(try) <- names(iris.lst)
    t(
      sapply(names(try), 
           function(i) {
             result <- cor.test(iris.lst[[i]][, 1], iris.lst[[i]][, 2], method=try[[i]])
             to_return <- result[c("estimate", "p.value")]
             to_return["conf.int1"] <- ifelse(is.null(result[["conf.int"]]), NA, result[["conf.int"]][1])
             to_return["conf.int2"] <- ifelse(is.null(result[["conf.int"]]), NA, result[["conf.int"]][2])
             return(to_return)
             }
           )
      )
    

    output:

               estimate  p.value           conf.int1 conf.int2
    setosa     0.7553375 0.000000000231671 NA        NA       
    versicolor 0.517606  0.0001183863      NA        NA       
    virginica  0.4572278 0.0008434625      0.2049657 0.6525292
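
    If a plain data frame with numeric columns is more convenient than the list matrix that sapply() returns, one possible variant is sketched below. The names cor_row and cor.methods are hypothetical and not part of the original answer, and exact = FALSE is an assumption that the asymptotic Spearman p-value is acceptable:

    ```r
    # Rebuild the inputs from the question (iris ships with base R)
    iris.lst <- split(iris[, 1:2], iris$Species)
    cor.methods <- c(setosa = "spearman", versicolor = "spearman", virginica = "pearson")

    # Hypothetical helper: run one cor.test and return the key numbers as a one-row data frame
    cor_row <- function(df, m) {
      res <- cor.test(df[, 1], df[, 2], method = m, exact = FALSE)
      ci <- res$conf.int                             # NULL for Spearman
      if (is.null(ci)) ci <- c(NA_real_, NA_real_)   # keep the columns numeric
      data.frame(method    = m,
                 estimate  = unname(res$estimate),
                 p.value   = res$p.value,
                 conf.low  = ci[1],
                 conf.high = ci[2])
    }

    # One row per data frame, each computed with its own method
    stats <- do.call(rbind, Map(cor_row, iris.lst, cor.methods[names(iris.lst)]))
    stats
    ```

    Because every column except method is numeric, the result can be filtered or rounded directly, for example with stats[stats$p.value < 0.05, ].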