r tapply

R tapply: how to use the INDEX names as an additional argument to FUN?


I would like to use the names of the INDEX factor inside my FUN function in tapply.

My data and function are more complex, but here is a simple reproducible example:

data <- data.frame(x = c(4,5,6,2,3,5,8,1), 
                   name = c("A","B","A","B","A","A","B","B"))
myfun <- function(x){paste("The mean of NAME is ", mean(x))}
tapply(data$x, data$name, myfun)

Result:

                         A                          B 
"The mean of NAME is  4.5"   "The mean of NAME is  4" 

I would like NAME to be replaced by the group name, A or B.


Solution

  • One option would be to pass both the value column and the index column to your function:

    data <- data.frame(
      x = c(4, 5, 6, 2, 3, 5, 8, 1),
      name = c("A", "B", "A", "B", "A", "A", "B", "B")
    )
    
    myfun <- function(x) {
      sprintf("The mean of %s is %f", unique(x[[2]]), mean(x[[1]]))
    }
    
    tapply(data[c("x", "name")], data$name, myfun)
    #>                           A                           B 
    #> "The mean of A is 4.500000" "The mean of B is 4.000000"