r dplyr

How to get t.test and dplyr group_by/summarise to be recognized within a function in R?


I wrote a function to run several statistical procedures on several variables in a dataset: 1) a paired t-test, 2) a simple boxplot, and 3) summary statistics of patient measures before and after an intervention.

Here is my code so far:

stat_func<-function(df, var){
  
  t.test(data=df, var ~ timepoint, paired=TRUE)
  
  
  boxplot(data=df,
          var ~ timepoint, 
          col = c("pink", "#EFC000FF"),
          xlab="",
          ylab="Percent Time Between 70-140")
  
  df%>%
    dplyr::group_by(timepoint)%>%
    dplyr::summarise(min=round(min(var), 2), mean=round(mean(var), 2), max=round(max(var), 2), sd=round(sd(var), 2))
  
  
}

When I run this code, RStudio doesn't show the t-test output, it DOES create the boxplot, and the dplyr::summarise output is identical for both groups: it doesn't distinguish "before" from "after" in the timepoint variable.

Here is example data for one of my variables of interest:

have<-as.data.frame(structure(list(subjectid=structure(c(1, 1, 2, 2, 3, 3, 4,4)), 
                    timepoint=structure(c("before", "after", "before", "after", "before", "after", "before", "after")),
                    estimated_a1c=structure(c(10, 7.5, 10.5, 9.4, 9.8, 7, 9.9, 7.3)))))

When I define the function and then call stat_func(have, have$estimated_a1c), the RStudio console shows no t-test output, the boxplot is generated, and the summarise results look like this, although the two rows shouldn't be identical:

  timepoint   min  mean   max    sd
  <chr>     <dbl> <dbl> <dbl> <dbl>
1 after         7  8.93  10.5  1.41
2 before        7  8.93  10.5  1.41

Any suggestions for how to get t-test & correct group-by/summarise output would be much appreciated.

Thank you!


Solution

  • Here's a shot at what I think you're getting at.

    library(dplyr)
    
    df <- as.data.frame(structure(
      list(
        subjectid = structure(c(1, 1, 2, 2, 3, 3, 4, 4)),
        timepoint = structure(
          c(
            "before",
            "after",
            "before",
            "after",
            "before",
            "after",
            "before",
            "after"
          )
        ),
        estimated_a1c = structure(c(10, 7.5, 10.5, 9.4, 9.8, 7, 9.9, 7.3))
      )
    ))
    
    # Define function
    stat_func <- function(df, var, time) {
    
      # draw boxplot on the active graphics device (Plots pane in RStudio)
      boxplot(
        formula = as.formula(paste(var, "~", time)),
        data = df,
        col = c("pink", "#EFC000FF"),
        xlab = "",
        ylab = "Percent Time Between 70-140"
      )
      
      # results list for t.test and summary df
      results <- list()
      
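      # save t test to list
      # (values are paired by position, so rows must be in the same
      # subject order within each timepoint)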
      results$t.test <- t.test(
        df[[var]][df[[time]] == "before"],
        df[[var]][df[[time]] == "after"],
        paired = TRUE
      )
      
      # save summary df to list
      results$summary <- df %>%
        dplyr::group_by(!!sym(time)) %>%
        dplyr::summarise(
          min = round(min(!!sym(var)), 2),
          mean = round(mean(!!sym(var)), 2),
          max = round(max(!!sym(var)), 2),
          sd = round(sd(!!sym(var)), 2)
        )
      
      return(results)
    }
    
    stat_func(df = df, var = 'estimated_a1c', time = 'timepoint')
    
    $t.test
    
        Paired t-test
    
    data:  df[[var]][df[[time]] == "before"] and df[[var]][df[[time]] == "after"]
    t = 5.7934, df = 3, p-value = 0.01023
    alternative hypothesis: true mean difference is not equal to 0
    95 percent confidence interval:
     1.014025 3.485975
    sample estimates:
    mean difference 
               2.25 
    
    
    $summary
    # A tibble: 2 × 5
      timepoint   min  mean   max    sd
      <chr>     <dbl> <dbl> <dbl> <dbl>
    1 after       7     7.8   9.4  1.09
    2 before      9.8  10.0  10.5  0.31
    
    

    The boxplot should appear in the Plots pane, while the list containing the t-test result and the summary data frame prints to the console. If you want to store the boxplot in the list as well so you can save it, I have had better luck saving ggplot figures to lists than base boxplots, though I'm sure there's a way to do it with base graphics.
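
    As a rough sketch of the ggplot route, the boxplot could be built as a ggplot object and stored in the list next to the other results. This assumes the ggplot2 package is installed. The helper name boxplot_gg is hypothetical and not part of the function above, and the y-axis label simply reuses the variable name.

    ```r
    library(ggplot2)

    # Same example data as in the question
    df <- data.frame(
      subjectid = c(1, 1, 2, 2, 3, 3, 4, 4),
      timepoint = c("before", "after", "before", "after",
                    "before", "after", "before", "after"),
      estimated_a1c = c(10, 7.5, 10.5, 9.4, 9.8, 7, 9.9, 7.3)
    )

    # Hypothetical helper: returns a ggplot object instead of drawing directly.
    # var and time are column names passed as strings, as in stat_func above.
    boxplot_gg <- function(df, var, time) {
      ggplot(df, aes(x = .data[[time]], y = .data[[var]], fill = .data[[time]])) +
        geom_boxplot(show.legend = FALSE) +
        scale_fill_manual(values = c("pink", "#EFC000FF")) +
        labs(x = "", y = var)
    }

    # Store the plot object in a list, the same way as the t-test and summary
    results <- list()
    results$boxplot <- boxplot_gg(df, "estimated_a1c", "timepoint")

    # Printing the stored object draws it on the active graphics device
    print(results$boxplot)
    ```

    Because the plot is an ordinary R object, it can be redrawn later with print() or written to a file with ggsave().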

    A couple of things to note: